Abstract

This paper aims to give an overview of the current state of fault-tolerant quantum computing,
by surveying a number of results in the field. We show that thresholds can be obtained
for a simple noise model as first proved in [AB97, Kit97, KLZ98], by presenting a proof for
statistically independent noise, following the presentation of Aliferis, Gottesman and Preskill
[AGP06]. We also present a result by Terhal and Burkard [TB05] and later improved upon
by Aliferis, Gottesman and Preskill [AGP06] that shows a threshold can still be obtained for
local non-Markovian noise, where we allow the noise to be weakly correlated in space and
time. We then turn to negative results, presenting work by Ben-Aroya and Ta-Shma [BT11]
who showed conditional errors cannot be perfectly corrected. We end our survey by briefly
mentioning some more speculative objections, as put forth by Kalai [Kal08, Kal09, Kal11].

1 Introduction 

We have come to take for granted that our modern (classical) computers can perform complex 
computations for hours, days or weeks on end without failing. We say that such implementations 
of classical computation are essentially perfect. For the successful implementation of a quantum 
computer, however, we will have to guard against noise impacting our computation. This paper
discusses several results in the area of quantum error-correction and fault-tolerant quantum
computing (FTQC). We assume familiarity with the basic principles of quantum computing (see, e.g.,
[NC00] or [dW11] for an introduction) as well as some knowledge of linear algebra and discrete
mathematics. 

When dealing with classical computation, we often use the Turing Machine (TM) model of 
computation. There is a quantum analogue to the TM called the Quantum Turing Machine 
(QTM), but it is highly complex and not nearly as intuitive for quantum computing as the TM 
is for classical computing. Instead of the QTM model, it is standard when dealing with quantum 
computing to look at the quantum circuit model. As with classical (Boolean) circuits, a quantum 
circuit is built up from a variety of gates. Instead of logical (Boolean) gates such as AND or OR, 
quantum circuits contain quantum gates: unitary operations on a fixed number of qubits, usually
1, 2 or 3. 

These gates can be executed sequentially, corresponding to the ordinary product of the unitaries. 
They can also be executed in parallel, on different input qubits, corresponding to the tensor product 
of the unitaries. Naturally there are infinitely many different quantum gates. We usually assume 
that we make use of only a finite number of gates which suffice to approximate any unitary gate. 
It is not important to our discussions exactly which so-called universal set of gates is used, as
long as the set is finite. Common examples of universal sets of gates include the set containing
the Hadamard gate (H) and the Toffoli gate (T) or the set containing the Hadamard gate, the
CNOT gate and the $\pi/8$ gate, $\begin{pmatrix} 1 & 0 \\ 0 & e^{i\pi/4} \end{pmatrix}$. The longest time required to execute any of the gates in
the universal set is called the fundamental gate time.

The input to a circuit is conceptually divided into a number of registers. We sometimes need
to make use of temporary qubits, prepared in a fixed known state (usually $|0\rangle$), that we dispose of
immediately after use. Such temporary qubits that are not considered part of any input register
are called ancilla qubits. The quantum circuit we wish to implement may contain a number of
measurements of one or more qubits. However, we may and will assume that those measurements
are postponed until the very end of the circuit, see, e.g., [NC00, Section 4.4].

The noise we have to protect our quantum computations from can be viewed in two ways. First 
we consider noise as impacting qubits while they are not being acted upon, the so-called storage 
errors, or errors for short. We then look at noise as something that impacts the performance of 
our quantum gates. Such noise is said to introduce faults in our gates. For a description of the 
various types of noise that can impact a quantum computation and their physical motivations, see, 
e.g., [Ali11, Sections 1.2.2.1, 1.2.3.1]. The term fault-tolerant quantum computing (FTQC)
is used to describe quantum computers that are capable of dealing with faults and errors without 
yielding incorrect answers. This is quite different from the way we consider classical computation, 
which we called essentially perfect. A computer being perfect means that it is insusceptible to 
noise, rather than being capable of mitigating its harmful influences. 

Let us first consider storage errors in the classical case. When we limit ourselves to a single bit, 
the damage any noise can do is quite restricted. Either the bit is left intact or it is flipped, i.e.,
0 becomes 1 or 1 becomes 0. When we consider a string of bits, however, more variation is possible.
Any combination of bits in the string could be flipped. There could be some imaginary adversary 
that decides which bits to flip, but for now we will limit ourselves to a simpler noise model. In this 
model each bit of a binary string x is flipped with some probability p independent of the others. 
We refer to this model as independent noise or a bit-flip channel. Suppose this noise were to 
impact our bitstring x of length n. Then with probability $(1 - p)^n$ the string is left intact and with
probability $1 - (1 - p)^n$ at least one bit is flipped. Without error correction we have no way to
detect which bit or bits have been flipped and so with probability $1 - (1 - p)^n$ we cannot recover
the intended state of the string. 

The solution is to encode the intended state, adding redundant information. That way we can 
tolerate a portion of the information being lost while still being able to recover the original string. 
Fortunately in the classical world we can clone information, so we can, for example, copy each bit 
in our string several times in the hope that few of the bits will be damaged. We can then decide 
what the original bit was by taking the majority value of our copies. More formally we define an
encoder $C : \{0, 1\} \to \{0, 1\}^3$ that encodes one bit into three bits by C(0) = 000 and C(1) = 111.
We refer to bits we wish to encode (0 and 1) as logical bits and to the bits that are stored or
transmitted (C(0) and C(1)) as physical bits. To recover the logical bits from the physical bits
we need a decoder $D : \{0, 1\}^3 \to \{0, 1\}$ which will output the majority value of its input. Define
the Hamming weight of $x \in \{0, 1\}^n$ as $|x| = |\{i \mid 1 \le i \le n \text{ and } x_i = 1\}|$, i.e., the number of 1s
in x. Now for $x \in \{0, 1\}^3$ we define D(x) = 0 if $|x| \le 1$ and D(x) = 1 if $|x| \ge 2$.

Now let us analyze what happens when C(0) or C(1) are exposed to the independent noise. So
long as at most one bit of C(0) or C(1) is flipped, our decoder D will return us to the intended
state. The intended state becomes unrecoverable when at least 2 bits are flipped. Because each
bit is flipped with probability p independent of the others, the probability that at least 2 bits are
flipped is $3(1 - p)p^2 + p^3$, i.e., the probability that 2 or 3 bits are flipped. This by itself does not
mean that we have increased the probability of recovering our string. For that to have happened
it must be that $3(1 - p)p^2 + p^3 = 3p^2 - 2p^3 < p$. This will certainly be the case if $3p^2 < p$, which happens when
p < 1/3. We refer to 1/3 as a threshold value. Once we can get the noise rate below it we are
certain we can encode information to increase the chance of recovering from the noise. In fact we
can bring the probability of recovering the intended state arbitrarily close to 1 by repeating the
encoding procedure an arbitrary number of times: we can consider each of the bits of C(0) and
C(1) as logical bits themselves and encode them using C. This process is called concatenation.
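
The three-bit code and its failure probability can be checked numerically. The following is a minimal sketch, assuming the independent bit-flip model described above; the function names (`encode`, `decode`, `failure_probability`, `concatenated_failure`) are illustrative and not taken from any of the cited works. Concatenation is modelled by treating the failure probability of one level as the flip probability seen by the next level.

```python
from itertools import product


def encode(bit):
    """Encoder C: one logical bit becomes three identical physical bits."""
    return (bit, bit, bit)


def decode(block):
    """Decoder D: majority vote, i.e. 1 iff the Hamming weight is at least 2."""
    return 1 if sum(block) >= 2 else 0


def failure_probability(p):
    """Exact probability that D fails on C(0) under independent bit flips."""
    total = 0.0
    for flips in product((0, 1), repeat=3):
        noisy = tuple(b ^ f for b, f in zip(encode(0), flips))
        if decode(noisy) != 0:
            weight = sum(flips)  # number of flipped bits in this pattern
            total += p ** weight * (1 - p) ** (3 - weight)
    return total


def concatenated_failure(p, levels):
    """Failure probability after `levels` rounds of concatenation."""
    for _ in range(levels):
        p = failure_probability(p)
    return p
```

For example, `failure_probability(0.1)` evaluates $3(1 - p)p^2 + p^3$ at p = 0.1, and each further level of `concatenated_failure` shrinks the result again as long as p starts below the threshold.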

In the quantum case even a single qubit can be exposed to a continuum of different errors. This
is easy to see when we write a qubit $|\psi\rangle$ as $\alpha|0\rangle + \beta|1\rangle$. So long as $|\alpha|^2 + |\beta|^2 = 1$ we still have
a qubit and there are uncountably many pairs $(\alpha, \beta)$ that satisfy the equation. For now we will
assume that noise acting on a single qubit is some unitary operator. It is common to write such
unitaries in the Pauli basis, given by

$$I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

Every 2-by-2 matrix can be written as a linear combination of these four matrices. In particular
we refer to unitary errors of the form X, Y or Z as Pauli errors. We note that Y = iXZ, so apart from a
global phase of i we can act as though Y = XZ.

Just as before we must encode our quantum states if we wish to guard against noise. We 
formalize the concept of encoding quantum states as follows. 

Definition 1.1. Given a state space $\mathcal{N}$ with $\dim(\mathcal{N}) = 2^n$, we call a space $\mathcal{M} \subseteq \mathcal{N}$ an [n, k] Quantum
Error Correcting Code (QECC) encoding k qubits into n qubits if $\dim(\mathcal{M}) = 2^k$. Associated
with each QECC is a map from k-qubit states $|x\rangle$ to their encoded n-qubit states $|\bar{x}\rangle$.

We call a vector in $\mathcal{M}$, i.e., an encoded quantum state, a codeword and we refer to $\mathcal{M}$ as a
QECC or as the code space. As an example, we will analyze what happens when a qubit that
is encoded using a [9, 1] QECC called the Shor code is exposed to a unitary storage error. The
Shor code encodes $|0\rangle$ as $\frac{1}{2\sqrt{2}}(|000\rangle + |111\rangle)^{\otimes 3}$ and $|1\rangle$ as $\frac{1}{2\sqrt{2}}(|000\rangle - |111\rangle)^{\otimes 3}$. For $b \in \{0, 1\}$ we will
denote the encoding of $|b\rangle$ by $|\bar{b}\rangle$.

Let us consider what happens when an X error hits one of the qubits of $|\bar{b}\rangle$. It can hit any
of the nine qubits, so for $1 \le k \le 9$, let $X^k$ be the operator that applies X to the k'th qubit of
$|\bar{b}\rangle$ and I to the others. So if an X error hits $|\bar{b}\rangle$, we are left with $X^k |\bar{b}\rangle$ for some k. We can
detect which of the nine qubits was subjected to the X, i.e., we can determine k. This can be done
without collapsing the state, see, e.g., [NC00, Chapter 10] for details. We can store the location
k of the affected qubit using four ancilla qubits. By convention if no X error has occurred we let
k = 0 and we define $X^0$ as the operator that applies I to all qubits in the state. If $|\bar{b}\rangle$ is struck
by a Z error, then one of the three blocks $(|000\rangle \pm |111\rangle)$ will have a different sign than the other
two. Note that it does not matter which of the qubits in the block was hit by the error, as the
effect is the same. So we can for $1 \le \ell \le 3$ define $Z^\ell$ to be the operator that applies Z to some
qubit in the $\ell$'th block of $|\bar{b}\rangle$ and I to the others. So if a Z error hits we are left with $Z^\ell |\bar{b}\rangle$ for
some $\ell$. As with X errors we can detect which block was hit by a Z error, i.e., we can determine
$\ell$. This information can be stored using two more ancilla qubits. As with X, we let $\ell = 0$ if no Z
error has occurred and define $Z^0$ to be the operator that applies I to all the qubits in the state.

So if our state $|\bar{b}\rangle$ is hit by an X or Z error, i.e., has turned into $X^k Z^\ell |\bar{b}\rangle$ for some $k \in \{0, \ldots, 9\}$
and $\ell \in \{0, \ldots, 3\}$ we can detect these errors and write k and $\ell$ into ancilla qubits to obtain
$X^k Z^\ell |\bar{b}\rangle |k\rangle |\ell\rangle$. This procedure is called error detection and we refer to the pair $(k, \ell)$ as the
error syndrome. We now measure the ancilla qubits to obtain the error syndrome. We correct
the errors by applying another $X^k$ to the state and applying a Z to some qubit in the $\ell$'th block,
say the first. This is called error correction. As discussed before, a Y error hitting a qubit
is the same, modulo a global phase of i, as both an X and a Z error hitting that qubit. So if
we define $Y^k$ as the operator that applies Y to the k'th qubit and I to the others we can say
that $Y^k |\bar{b}\rangle = i X^k Z^\ell |\bar{b}\rangle$ where $\ell$ is the block containing the k'th qubit. We can perform the
error detection and error correction steps to obtain, after discarding the ancilla qubits, the state
$i |\bar{b}\rangle$. Note that measuring this state gives the same probability distribution as measuring $|\bar{b}\rangle$,
so we are safe to ignore this global phase and say that we can also correct Y errors. Also note
that we can trivially correct an I error, as this leaves the state intact. When we perform the
error-detection step, the error syndrome will be (0, 0) and performing $X^0$ and $Z^0$ has no effect,
so the state remains correct. After the error-detection and error-correction steps the Shor code
discards the ancilla qubits that held the error syndrome. This is a way to remove from the system
the entropy introduced by the errors. As such the Shor code requires a constant fresh supply of
properly prepared ancilla qubits. In fact such a constant fresh supply of ancillas is a prerequisite
for implementing any QECC.
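
The detection and correction steps for a single Pauli error on the Shor code can be illustrated with a small classical simulation. This is a minimal sketch under loud assumptions: the state is stored as a dictionary from 9-bit tuples to real amplitudes, and the syndrome $(k, \ell)$ is read directly from that classical description rather than written into ancilla qubits and measured, which is a simulation shortcut and not a circuit construction. All function names are hypothetical.

```python
from itertools import product
from math import sqrt


def encode(b):
    """Shor encoding of |b>: three blocks (|000> +/- |111>) / sqrt(2)."""
    sign = 1 if b == 0 else -1
    state = {}
    for choice in product((0, 1), repeat=3):
        bits = tuple(c for c in choice for _ in range(3))
        state[bits] = sign ** sum(choice) / (2 * sqrt(2))
    return state


def flip_bits(bits, qubits):
    return tuple(b ^ 1 if i in qubits else b for i, b in enumerate(bits))


def apply_x(state, q):
    """Apply X to qubit q (0-based)."""
    return {flip_bits(bits, {q}): amp for bits, amp in state.items()}


def apply_z(state, q):
    """Apply Z to qubit q (0-based)."""
    return {bits: -amp if bits[q] else amp for bits, amp in state.items()}


def bit_flip_location(state):
    """Return k in 1..9 for the flipped qubit, or 0 if there is none."""
    bits = next(iter(state))  # every basis state shares the same parities
    for block in range(3):
        a, b, c = bits[3 * block:3 * block + 3]
        syndrome = (a ^ b, b ^ c)
        if syndrome != (0, 0):
            offset = {(1, 0): 0, (1, 1): 1, (0, 1): 2}[syndrome]
            return 3 * block + offset + 1
    return 0


def block_parity(state, blocks):
    """Expectation value (+1 or -1) of X on all six qubits of two blocks."""
    qubits = {3 * blk + i for blk in blocks for i in range(3)}
    value = sum(amp * state.get(flip_bits(bits, qubits), 0.0)
                for bits, amp in state.items())
    return round(value)


def phase_flip_block(state):
    """Return the block 1..3 whose sign differs, or 0 if there is none."""
    syndrome = (block_parity(state, (0, 1)), block_parity(state, (1, 2)))
    return {(1, 1): 0, (-1, 1): 1, (-1, -1): 2, (1, -1): 3}[syndrome]


def correct(state):
    """Apply X^k, then Z on the first qubit of block ell."""
    k = bit_flip_location(state)
    ell = phase_flip_block(state)
    if k:
        state = apply_x(state, k - 1)
    if ell:
        state = apply_z(state, 3 * (ell - 1))
    return state
```

Applying `apply_z` followed by `apply_x` to the same qubit models a Y error up to the global phase discussed above, and `correct` returns the original codeword in all three cases.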

Furthermore, since any 2-by-2 matrix M can be written as a linear combination of the Pauli
matrices, i.e., $M = a_I I + a_X X + a_Y Y + a_Z Z$, we can correct any error hitting a single qubit.
To see this note that when the k'th qubit of a state $|\bar{b}\rangle$ is subjected to M we obtain after the
error-detection step the following state, where $\ell$ is the block containing the k'th qubit.

$$a_I |\bar{b}\rangle |0\rangle |0\rangle + a_X X^k |\bar{b}\rangle |k\rangle |0\rangle + a_Y X^k Z^\ell |\bar{b}\rangle |k\rangle |\ell\rangle + a_Z Z^\ell |\bar{b}\rangle |0\rangle |\ell\rangle$$

Note that we ignored the global phase of i for the Y error. Measuring the ancilla qubits, this state
collapses to one of the four terms. Each of those we can correct to recover $|\bar{b}\rangle$.
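
The decomposition of an arbitrary 2-by-2 matrix into the Pauli basis can be computed explicitly. The sketch below assumes the standard Pauli matrices and uses the identity $a_P = \mathrm{tr}(PM)/2$, which follows from the Pauli matrices being orthogonal under the trace inner product; the helper names are illustrative only.

```python
I = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
PAULIS = {"I": I, "X": X, "Y": Y, "Z": Z}


def matmul(a, b):
    """Product of two 2-by-2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def trace(a):
    return a[0][0] + a[1][1]


def pauli_coefficients(m):
    """Coefficients a_P with m = a_I I + a_X X + a_Y Y + a_Z Z."""
    return {name: trace(matmul(p, m)) / 2 for name, p in PAULIS.items()}


def reconstruct(coeffs):
    """Rebuild the matrix from its Pauli coefficients."""
    return [[sum(coeffs[name] * PAULIS[name][i][j] for name in PAULIS)
             for j in range(2)] for i in range(2)]
```

Calling `reconstruct(pauli_coefficients(m))` returns `m` again for any 2-by-2 matrix `m`, which is the property the argument above relies on.
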

Faults can arise as the result of an imperfect implementation of a quantum gate. They can also
be the result of an imperfect preparation of a register qubit. The input to a quantum circuit is 
given classically and the quantum computer must encode this classical state into the register qubits 
before the circuit can be executed. Errors may also arise as the result of imperfect measurements. 
We call a gate that performs a different operation than intended a faulty gate. Noise is to blame 
for gates being faulty, so we say that noise introduces faults in gates. As hinted at above, we can
use QECCs to protect gates from faults induced by noise. In particular we can limit how far gates
spread errors in their input to errors in their output. Such implementations of gates
are called fault-tolerant (implementations of) gates.

The way circuits are constructed, with both parallel and sequential executions of gates, it is 
very common for some qubits or even entire registers to remain resting for large portions of the 
circuit's execution. Resting in this case means that there are no gates acting on them. The longer 
a qubit is resting, the more likely it is to be hit by storage errors. We can however consider such 
qubits to be acted upon by identity gates (I) and we can create fault-tolerant implementations of
identity gates to guard against storage errors. This shows how we can think of errors as faults.
We can also think of faults as (storage) errors by thinking of a faulty gate as an ideal gate followed
by some (not necessarily unitary) error operator. That error operator can then be seen as causing
a storage error. This illustrates how we can think of noise as something that causes errors or as
something that causes faults (or both), whichever way of thinking is more convenient for us at any
given time.

The simplest noise models in terms of analysis are independent noise models, where each qubit 
is hit by an error independent of the others. It is generally believed, however, that such noise 
models are not physically realistic. It is assumed that in physically realistic models the noise will 
be correlated, either in time, in space or in both. It may also be possible that the noise does not 
act the same on each term of a state in superposition, something that we do assume with the 
independent noise model. This paper aims to give an overview of the current state of FTQC, of 
noise models for which we have threshold results as well as types of noise for which no threshold 
can be obtained. 

This paper is structured as follows. We start with some positive results, i.e., showing that 
FTQC is possible provided the noise levels are low enough. In Section 2 we show this for an 
independent noise model using the framework and method presented in [AGP06]. In Section 3 we 
show it for a "local non-Markovian" noise model, as first done by Terhal and Burkard in [TB05] 
and later improved upon by Aliferis, Gottesman and Preskill in [AGP06]. Then in Section 4 we 
turn to some negative results by Ben-Aroya and Ta-Shma, who showed in [BT11] that certain 
types of errors cannot be corrected by any QECC, although some errors can be approximately 
corrected. In Section 5 we turn to more speculative objections to FTQC as put forth by Kalai in
[Kal08, Kal09, Kal11]. Finally we conclude in Section 6.

2 A threshold result for independent noise 

In this section we will describe a general framework for fault-tolerant quantum computation in 
the face of independent noise. This result was first proved in [AB97, Kit97, KLZ98], but here we 
will follow the presentation of [AGP06]. The next section will deal with a more challenging noise 
model. 

The goal is to create (and prove correct) a fault-tolerant implementation of an arbitrary quantum
circuit. That circuit we shall refer to as the ideal circuit and denote by $M_0$. We start by
dividing $M_0$ into a set of locations, each corresponding to a single gate, qubit preparation or
measurement in the circuit. Note that we consider a resting qubit (i.e., one that is not being acted 
upon by a gate) to be acted upon by the identity gate. Thus each time interval where a qubit is 
resting is divided into a number of locations corresponding to identity gates. Note that we may 
treat each location that corresponds to the application of a gate as corresponding to a time interval 
of length tg, the fundamental gate time. 

Now the QECC comes into play. Let C be a QECC that encodes one (logical) qubit into m 
(physical) qubits. We refer to a set of m qubits that are the encoding of a single qubit by C as a 
1-block. We will encode (here the term is used informally) each location into a group of locations 
called a rectangle. There will be rectangles for preparing register qubits, qubit measurements
and the application of gates. When all the locations in $M_0$ are replaced by rectangles we obtain a
new circuit which we shall call $M_1$.

In our ideal circuit $M_0$ we will prepare register qubits in some basis, say the computational
basis. Where in $M_0$ we would simply be supplied with a $|0\rangle$ qubit, in $M_1$ we will need an encoded
$|0\rangle$, $C(|0\rangle)$. A qubit preparation rectangle is thus a rectangle that provides us with $C(|0\rangle)$.
Furthermore we assume that the rectangle contains circuitry for performing the error-detection and
error-correction steps as described in Section 1 after the preparation of $C(|0\rangle)$. We refer to such error-detecting
and error-correcting circuitry that corrects errors in a 1-block as a 1-EC, for Error Correction.
Note that we will not replace the preparation of the ancilla qubits used to hold the error syndrome
by the QECC, only the preparation of register qubits.

Similarly we must be able to measure the logical value of a 1-block in $M_1$, in other words we
must be able to decode a 1-block to obtain the measurement outcome had we measured in $M_0$. For
this operation we use qubit measurement rectangles. Depending on the QECC used and the 
rectangle design this might be as simple as measuring each qubit in the 1-block and then taking 
the (recursive) majority. Since the measurement outcomes are classical and we can classically 
derive the logical output from the measurements there is no need for a 1-EC in the measurement 
rectangles. Naturally we assume that classical information storage and computation is perfect. 

Each gate in a circuit is replaced by a gate application rectangle, which consists of a
fault-tolerant implementation of the gate, called a 1-Ga for Gate, followed by a 1-EC. Depending on the
code used we may have to require that the gates of the ideal circuit $M_0$ are gates in a particular
universal set of gates. In this discussion however we will not fix such a set. We call a group of
locations in $M_1$ that are the encoding of a single location in $M_0$, i.e., that make up the rectangle
for that location in $M_0$, a 1-Rectangle or 1-Rec for short.

The procedure of encoding a single (logical) qubit into a rectangle consisting of (physical) qubits 
can be repeated many times, by considering the qubits that make up the rectangles as logical qubits 
themselves and replacing each by a rectangle as before. Thus we can have that each location in
$M_0$ is encoded by a rectangle consisting of locations in $M_1$ that are each encoded by a rectangle
consisting of locations in $M_2$ and so on. To reason about such recursive encodings we shall extend
our definitions somewhat. 

A set of qubits in $M_r$ that are the (concatenated) encoding of a single qubit in $M_{r-s}$ is called
an s-block in $M_r$. We have already seen a 1-block in $M_1$, which corresponded to a single qubit in
$M_{1-1} = M_0$ and by the nature of C thus consisted of m qubits. At this level we can say the 1-block
contains physical qubits that encode a single logical qubit of $M_0$. Similarly a 1-block in $M_2$ will be
the encoding of a single qubit in $M_{2-1} = M_1$ and will also be m qubits. From this perspective we
can consider the 1-block to consist of the physical qubits that encode a single logical qubit of $M_1$.
A 2-block in $M_2$ however will be the encoding of a single qubit in $M_{2-2} = M_0$, which corresponds
to m qubits in $M_1$, each of which becomes m qubits in $M_2$, thus the size of a 2-block in $M_2$ is $m^2$
qubits. In general an r-block in $M_r$ consists of $m^r$ qubits. Here we can say that the $m^2$ qubits of
$M_2$ are the physical qubits for m logical qubits in $M_1$, which are themselves physical qubits for a
single logical qubit in $M_0$. So what we call physical or logical qubits in $M_k$ depends on whether
we look 'down' to $M_{k+1}$ or 'up' to $M_{k-1}$.

Similarly we call a group of locations in $M_r$ that are the (concatenated) encoding of a single
location in $M_{r-s}$ an s-Rec in $M_r$. We have seen that a 1-Rec in $M_1$ is a rectangle as it corresponds
to a single location in $M_{1-1} = M_0$. Similarly a 1-Rec in $M_2$ would correspond to a rectangle for
a location in $M_1$ and a 2-Rec in $M_2$ corresponds to the set of rectangles for the locations in $M_1$
that make up a rectangle for a single location in $M_0$. In general it may be difficult to calculate
the precise number of locations in an r-Rec in $M_r$, but if we let L be the maximum number of
locations in a rectangle we can give an upper bound as $L^r$. We could also generalize our definitions
of a 1-EC and a 1-Ga, but we will not often need to refer to s-ECs or s-Gas in this discussion.
Figure 1 illustrates some of these key definitions.

Since each 1-Rec in a circuit ends in a 1-EC, with the exception of measurement rectangles
which we assume to only occur at the very end of the circuit, each 1-Rec is also immediately
preceded by a 1-EC. We call a 1-Rec together with the 1-EC that immediately precedes it a
1-exRec for 'extended rectangle'. Similarly we can define an s-exRec in $M_r$ as an s-Rec in $M_r$
with its preceding s-EC.

[Figure 1 diagram; labels: $M_{k+1}$, (k + 1)-Rec in $M_{k+1}$, 1-Rec in $M_{k+1}$, physical gate in $M_{k+1}$.]

Figure 1: The relation between the level-k encoding of a circuit and the level-(k + 1) encoding of
the same circuit. Note that a single 'logical' gate in $M_k$ is encoded by a 1-Rec in $M_{k+1}$. This
'logical' gate takes a single qubit as input, so the corresponding 1-Rec takes a 1-block as input.
The gates that make up each 1-Rec in $M_{k+1}$ are called 'physical' gates.

We are assuming that C is a QECC that can correct a single unitary error, which means we 
can construct the 1-ECs and 1-Gas in 1-Recs such that the following conditions are met: 

1. If a 1-EC contains at most one fault, then it takes any pure state input to an output in the 
code space. 

2. If a 1-EC does not contain a fault, then it takes any pure state input with at most one error 
to an output with no errors. 

3. If a 1-EC contains at most one fault, then it takes a pure state input with no errors to an 
output with at most one error. 

4. If a 1-Ga contains no fault, then it takes a pure state input with at most one error to an 
output with at most one error in each output block. 

5. If a 1-Ga contains at most one fault, then it takes a pure state input with no errors to an 
output with at most one error in each output block. 

See [Ali11] for a general discussion of rectangle design and [AGP06, Sections 7 and 8] for an explicit
construction of rectangles that satisfy these conditions. 

We say that a 1-exRec is good if it is hit by at most one fault and that it is bad if it is hit
by at least two. The idea is that a good 1-exRec will leave at most one error in its output. We
call two bad 1-exRecs independent if they do not overlap, i.e., do not share a 1-EC, or if they
do overlap and the first 1-exRec would still contain at least two faults if we do not count the faults
in the shared 1-EC. We define goodness and badness for higher levels of concatenation recursively.
A k-exRec is good if it contains at most one bad (k - 1)-exRec and bad if it contains at least two.
Analogously to the 1-exRec case we call two bad k-exRecs independent if they do not overlap, i.e.,
do not share a k-EC, or if they do overlap and the first k-exRec would still contain at least two
bad (k - 1)-exRecs if we do not count the (k - 1)-exRecs in the shared k-EC.
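
The recursive definition of good and bad exRecs can be written down directly. The following is a simplified sketch: it assumes a 1-exRec is given as a list of booleans (True marks a faulty location) and a k-exRec as a list of (k - 1)-exRecs, and it deliberately ignores the overlap of neighbouring exRecs through shared ECs, so it does not model independence. The representation and function names are hypothetical.

```python
def is_bad(exrec, level):
    """Return True if the exRec contains at least two faults (level 1)
    or at least two bad (level - 1)-exRecs (level > 1)."""
    if level == 1:
        return sum(1 for faulty in exrec if faulty) >= 2
    return sum(1 for sub in exrec if is_bad(sub, level - 1)) >= 2


def is_good(exrec, level):
    """An exRec is good exactly when it is not bad."""
    return not is_bad(exrec, level)
```

For instance, a 2-exRec built from one bad and two good 1-exRecs is still good, while two bad 1-exRecs make it bad.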

Our strategy for proving the threshold result consists of three stages. First we show that if a
k-exRec is good then the k-Rec it contains will be 'correct'. Secondly we show that if all k-exRecs
in $M_k$ are good, the probability distribution of a measurement of $M_k$ will be the same as that of
$M_0$. Finally we show that the number of bad k-exRecs decreases doubly exponentially as the level
of encoding (k) increases. We conclude with the threshold result.

For the first step in our proof we need to define what it means for a k-Rec to be correct. To
this end we introduce the concept of an ideal k-decoder in $M_k$, which we define recursively. An
ideal 1-decoder in $M_1$ takes a 1-block as input, performs the error-detection and error-correction
steps, i.e., a 1-EC, and outputs a single decoded qubit. An ideal k-decoder in $M_k$ takes a k-block
as input and first runs ideal (k - 1)-decoders on each of the (k - 1)-blocks of its input and then
uses an ideal 1-decoder on the resulting 1-block. The decoder is called ideal because we assume
that it contains no faults, hence it is only a theoretical device. Note that these decoders are not
part of the actual fault-tolerant circuit, they are only used for the analysis.

We can now say that a k-Rec for the application of a gate is correct if the k-Rec followed
by an ideal k-decoder is equivalent to the ideal k-decoder followed by the ideal gate it is meant
to implement. A k-Rec for qubit preparation is called correct if the k-Rec followed by the ideal
k-decoder is equivalent to the qubit preparation the k-Rec is meant to implement. Finally a k-Rec
for qubit measurement is correct if the k-Rec is equivalent to the ideal k-decoder followed by the
measurement the k-Rec is meant to implement. Thus we can see that a correct k-Rec allows its
output state to be successfully decoded by some ideal decoder. We are now ready to prove our
first lemma.

Lemma 2.1 ([AGP06, Lemma 3]). Assume conditions 1-5. For $k \ge 1$, if a k-exRec is good then
the k-Rec it contains is correct.

Proof. We prove this by induction on k. For k = 1 we first consider a 1-exRec for a gate application.
Because the 1-exRec is good it contains at most one fault. If it contains no faults the result is
immediate. If it contains one fault we make a case distinction on the location of the fault.

• If the fault is in one of the 1-ECs in front of the 1-Rec, then by condition 3 its output contains 
at most one error. The output of the other 1-ECs is in the code space by condition 1. So the 
pure state inputs to the 1-Ga contain no errors, i.e., they are all in the code space. Now by 
condition 4 the output of the 1-Ga contains at most one error in each output block and by 
condition 2 this error is corrected by the 1-ECs that follow the 1-Ga. 

• If the fault is in the 1-Ga, then the 1-ECs preceding it have all output codewords by condition 
1. By condition 5 therefore the output of the 1-Ga has at most one error in each output 
block, which is corrected by the 1-ECs following it by condition 2. 

• If the fault is in one of the 1-ECs after the 1-Ga, then the 1-ECs preceding the 1-Ga have all 
output codewords by condition 1. Now by condition 4 the output of the 1-Ga has no errors. 
By condition 3 the output of the 1-Rec now contains at most one error. 

A similar argument goes for a 1-Rec for qubit preparation. Either the fault lies in the preparation, 
in which case it is corrected by the 1-ECs that follow it, or the fault is in one of the 1-ECs in which 
case condition 3 ensures the output contains at most one error. For 1-Recs for qubit measurement 
the fault can only lie in one of the preceding 1-ECs, in which case the ideal 1-decoder will correct 
it. 

We only show the inductive step for (k + 1)-exRecs for gate applications, those for qubit
preparations and measurements are done in a similar fashion. We need to show that
(k + 1)-exRecs followed by an ideal (k + 1)-decoder are equivalent to the (k + 1)-ECs followed by an ideal
(k + 1)-decoder followed by the gate the (k + 1)-Rec is meant to implement. By the definition of
an ideal (k + 1)-decoder we can view it as a number of k-decoders followed by a 1-decoder. Note
that each such k-decoder is preceded by a k-Rec, namely those the (k + 1)-Rec is made of. Using
the induction hypothesis we can move the k-decoders in front of these k-Recs, leaving them as
the ideal 1-Recs they are meant to implement. Now by the base case each 1-Rec followed by a
1-decoder is equivalent to a 1-decoder followed by the gate the 1-Rec is meant to implement.

Now our circuit has the following shape. First there are a number of (k + 1)-ECs, then the
k-decoders, then 1-decoders and finally the gate our (k + 1)-exRec was meant to implement. But
again by the definition of ideal decoders, this is equivalent to the (k + 1)-ECs followed by a
(k + 1)-decoder followed by the gate the (k + 1)-exRec was meant to implement. This completes the
proof. □

For the second part of the proof we use the following short lemma.

Lemma 2.2 ([AGP06, Lemma 4]). Assume conditions 1-5. If all k-exRecs in $M_k$ are good, then
$M_k$ has the same probability distribution on its outcome as $M_0$.

Proof. By Lemma 2.1 all k-exRecs being good implies that all the k-Recs contained in them are
correct. In particular all qubit-preparation k-Recs output at most one error and all the k-Recs
for gate applications do not spread this error. Thus at most one error per block arrives at the
k-Recs for qubit measurement at the end of the circuit and these k-Recs perform the measurement
faithfully, i.e., without faults. □

Our last lemma will show that there are few bad k-exRecs.

Lemma 2.3 ([AGP06, Lemma 2]). Let A be the largest number of pairs of locations in any 1- 
exRec. Assuming a noise model where faults occur in a location within a k-exRec with probability 
e independently, the probability e( fe ) that a k-exRec is bad satisfies 

e(«0 < 

A ' 

Proof. The probability that any given pair of locations in a 1-exRec is faulty is bound by e 2 , 
because the faults are independent. Thus the probability that a 1-exRec is bad, i.e., contains at 
least two faults, is ^ Ae 2 . Similarly a fc-exRec is bad if it contains at least two bad (fc — 1)- 
exRecs. The events of any two (fc — l)-exRecs being bad is also independent, so the probability 
that a fc-exRec is bad is e^ k ' ^ j4(e( fe-1 )) 2 . Solving this recursion gives us the desired bound. □ 
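
The step "solving this recursion" can be checked numerically. The following is a minimal sketch, not taken from [AGP06]: it iterates the recursion from the proof, treated as an equality, and compares the result with the closed form stated in the lemma. The function names and the values of A and eps are illustrative assumptions.

```python
def bad_exrec_bound(A, eps, k):
    """Iterate bound_j = A * bound_{j-1}**2, starting from bound_0 = eps."""
    bound = eps
    for _ in range(k):
        bound = A * bound ** 2
    return bound


def closed_form(A, eps, k):
    """Closed form (A * eps) ** (2 ** k) / A of the same recursion."""
    return (A * eps) ** (2 ** k) / A


if __name__ == "__main__":
    A, eps = 100, 1e-3  # hypothetical values satisfying eps < 1/A
    for k in range(5):
        print(k, bad_exrec_bound(A, eps, k), closed_form(A, eps, k))
```

For these values the two columns agree up to floating-point rounding, and each extra level squares the factor A * eps.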

We can improve this bound by noting that if a k-exRec contains two bad (k - 1)-exRecs that
are not independent, the k-exRec can still be considered good. The analysis required to arrive at
such a better bound is carried out in [AGP06, Section 5.2.1]. It is also worth noting that some
pairs of locations are benign, in the sense that if such a pair is faulty the k-exRec can still be
correct. A sharper bound can be obtained by not counting such pairs, see [AGP06, Section 6] for
the revised argument.

It is still clear from the result presented here that if $\epsilon < 1/A$, then the bound on the
probability that a k-exRec is bad decreases doubly exponentially as k increases. We will now use
this consequence of the lemma to prove the threshold result.

The threshold result will show that we can reduce the computation error of any quantum
computation to below an arbitrarily small amount. To formulate the theorem we must first define
what we mean by the computation error. Given the probability distributions $P = \{p_i\}$ and
$P' = \{p'_i\}$ of the measurements of two quantum computations, we define the $L_1$-distance
between them as $\sum_i |p_i - p'_i|$. Note that if $P = P'$, then the $L_1$-distance between them
is 0. The computation error of a quantum computation is now defined as the $L_1$-distance between
that computation and the ideal computation.

Theorem 2.4 ([AGP06, Theorem 1]). Assume conditions 1-5. Let A be the largest number of
pairs of locations in any 1-exRec and assume a noise model where faults occur at a location with
probability $\epsilon$ independently. If $\epsilon < 1/A$, then for any $\delta > 0$ there exists a
level k such that $M_k$ simulates a given circuit $M_0$ with computation error at most $\delta$.

Proof. Let $p^{(\mathrm{ideal})}$ be the probability distribution of the outcome of a measurement of
the ideal circuit $M_0$ and let $p^{(\mathrm{actual})}$ be that of a measurement of the circuit $M_k$.
We define $\delta$ to be the $L_1$-distance between these distributions, i.e.,
$\delta := \sum_i |p_i^{(\mathrm{actual})} - p_i^{(\mathrm{ideal})}|$. Let L be the number of
locations in $M_0$ and note that each such location is encoded in $M_k$ by a k-Rec.

Our computation will succeed if there are no bad k-exRecs in $M_k$, but it might fail if there are.
There are L k-Recs in $M_k$, each with its own k-exRec, and by Lemma 2.3 the probability that a
k-exRec is bad is bounded by $\epsilon^{(k)}$. So by the union bound we have that the probability
that at least one k-exRec in $M_k$ is bad is

$$P^{(k)}_{\mathrm{fail}} \le L\epsilon^{(k)} \le L(A\epsilon)^{2^k}.$$
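Let us call the averaged probability distribution over the outcomes of computations with at least
one bad k-exRec $p^{(\mathrm{fail})}$. Naturally by Lemma 2.2 we know that if all k-exRecs are good,
then $p^{(\mathrm{actual})} = p^{(\mathrm{ideal})}$. This lets us write

$$p_i^{(\mathrm{actual})} = (1 - P^{(k)}_{\mathrm{fail}})\, p_i^{(\mathrm{ideal})} + P^{(k)}_{\mathrm{fail}}\, p_i^{(\mathrm{fail})}.$$

Which gives us

$$\begin{aligned}
\delta &= \sum_i |p_i^{(\mathrm{actual})} - p_i^{(\mathrm{ideal})}| \\
&= (1 - P^{(k)}_{\mathrm{fail}}) \sum_i |p_i^{(\mathrm{ideal})} - p_i^{(\mathrm{ideal})}| + P^{(k)}_{\mathrm{fail}} \sum_i |p_i^{(\mathrm{fail})} - p_i^{(\mathrm{ideal})}| \\
&= 0 + P^{(k)}_{\mathrm{fail}} \sum_i |p_i^{(\mathrm{fail})} - p_i^{(\mathrm{ideal})}| \\
&\le 2P^{(k)}_{\mathrm{fail}} \le 2L(A\epsilon)^{2^k},
\end{aligned}$$

where the first inequality is because the maximum $L_1$-distance between any two probability
distributions is 2. We can rewrite this inequality to see that we can pick k such that

$$2^k \ge \frac{\log(2L/\delta)}{\log(1/(A\epsilon))}$$

to achieve an error less than or equal to $\delta$. □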



The crucial observation is that k scales at about $\log\log(1/\delta)$, meaning that the error can
be doubly exponentially reduced by only increasing the level of the simulation linearly.
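
To make the scaling concrete, here is a small sketch that searches for the smallest level k satisfying $2L(A\epsilon)^{2^k} \le \delta$, the bound from the proof of Theorem 2.4. The function name and all numerical values are assumptions chosen for illustration only.

```python
def min_level(A, eps, L, delta):
    """Smallest k with 2 * L * (A * eps) ** (2 ** k) <= delta.

    Requires 0 < eps < 1/A (below threshold) and delta > 0.
    """
    if not 0 < A * eps < 1:
        raise ValueError("need 0 < eps < 1/A")
    if delta <= 0:
        raise ValueError("need delta > 0")
    k = 0
    while 2 * L * (A * eps) ** (2 ** k) > delta:
        k += 1
    return k


if __name__ == "__main__":
    # Hypothetical circuit with a million locations and A * eps = 0.1.
    for delta in (1e-6, 1e-12, 1e-24):
        print(delta, min_level(100, 1e-3, 10 ** 6, delta))
```

Tightening the target error from 1e-6 to 1e-12 raises the required level by only one in this example, which is the doubly exponential suppression described above.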

We have already hinted at several possible optimizations for this result by refining the analysis,
such as counting only pairs of independent bad k-exRecs. In this paper we only consider QECCs
that can correct 1 error, but there are also QECCs that can correct more errors. Building fault-
tolerant computers using such QECCs can yield better thresholds, see, e.g., [AGP06] or [PR11].
Another way to bring fault-tolerant quantum computing forward is by showing that threshold
results exist for a wide variety of different noise models. This section dealt exclusively with the
simplest possible noise model in terms of analysis: stochastically independent noise. In the
following section we will show that threshold results can also be obtained for slightly less
favorable noise models.



3 A threshold result for local non-Markovian noise 

One of the properties that makes the independent noise model we have looked at so far easy to 
analyze, is that the errors (or faults) it introduces are not correlated in space or in time. In other 
words, an error is just as likely to occur at a location close to where another error occurs as it is to 
occur anywhere else in the circuit. The same goes for temporal correlations: there are none. These 
restrictions may be physically unrealistic. Removing them and allowing noise to be correlated in 
time and space is a first step towards adversarial noise models. This section presents a threshold 
result shown in [AGP06, Section 11] and [TB05] for such a noise model.

In the previous sections we have looked at quantum circuits as closed systems, somehow isolated
from their environment. The evolution of any closed quantum system over time is governed by the
Schrödinger equation,

$$i\hbar \frac{d}{dt}|\psi(t)\rangle = H|\psi(t)\rangle,$$

where $\hbar$ is the (reduced) Planck constant and H is a Hermitian matrix called the Hamiltonian of
the system. The Hamiltonian is the observable for the total energy of the system. We assume for now
that H is time-independent, but it is also possible to reason about time-dependent Hamiltonians,
i.e., Hamiltonians that are parametrized by a time variable. When we reason about small enough
time intervals though, we can treat the Hamiltonian for a single time interval as one that is not
time-dependent. To compute the evolution of the system over some time period, say for the time
interval $(t_1, t_2)$, we solve the Schrödinger equation to obtain

$$|\psi(t_2)\rangle = e^{-i(t_2 - t_1)H/\hbar}|\psi(t_1)\rangle.$$

It is customary to absorb $1/\hbar$ into H, so we can write

$$|\psi(t_2)\rangle = e^{-i(t_2 - t_1)H}|\psi(t_1)\rangle.$$

By linear algebra it can be shown that if H is a Hermitian matrix, then $e^{iH}$ is a unitary matrix.
In particular $e^{-i(t_2 - t_1)H} = U(t_2, t_1)$ for some unitary operator $U(t_2, t_1)$, which is
called a time-evolution operator. This explains why we can model quantum computing by quantum
circuits consisting of unitary operators. See, e.g., [NC00, Chapter 2.2.2] for more details about the
relation between the Schrödinger equation and the quantum circuit model.

As opposed to the previous section we will now take the system's environment, which we also
refer to as a bath, into account. Because the Schrödinger equation only applies to closed quantum
systems, this means that the Hamiltonian must also describe the evolution of the bath and the
interaction between the bath and the system, i.e., our circuit. In general the Hamiltonian of
the system and bath may be time-dependent. We may express the time-dependent Hamiltonian
H(t) of our system and bath as

$$H(t) = H_S(t) + H_{SB}(t) + H_B(t),$$

where $H_S(t)$ is the Hamiltonian for the system in isolation, $H_B(t)$ that for the bath in isolation
and $H_{SB}(t)$ that for the interaction between the system and the bath. We refer to the latter as
the interaction Hamiltonian. So far we have not placed any restrictions on the noise. We limit
the power of the noise by requiring that the interaction Hamiltonian has the form

$$H_{SB}(t) = \sum_{a \in A_t} H_{SB,a},$$

where each $a \in A_t$ is a set of qubits that are acted upon by the same gate in the circuit at
time t. For example if $q_1$ and $q_2$ are acted upon by a CNOT gate at time t, then
$\{q_1, q_2\} \in A_t$. Another example would be if a qubit $q_3$ is resting at time t' (remember
this is equivalent to an I gate acting on it), then $\{q_3\} \in A_{t'}$. We call a pair (a, t) such
that $a \in A_t$ a microlocation. Note that this does not correspond to what we called a location in
Section 2, but that a location in that sense does consist of a number of microlocations as
described here.

This restriction on the interaction Hamiltonian limits the power of the noise in the sense that 
errors can only be correlated when the qubits involved are already being correlated by the circuit. 
Since each gate in the circuit typically operates on few qubits (1, 2 or 3) for short periods of time, 
this model allows for weak spatial and temporal correlations. Long-range correlations (both in 
space and time) are still possible, because the interaction Hamiltonian can move information from 
one place in time (space) to another via the bath. We will see, however, that the influence of such 
indirect correlations does not stand in the way of obtaining a threshold. Still the bath can be seen 
as having a 'memory', even though the noise it produces is highly localized in nature. Informally 
we can say that a 'memoryless' process is a Markovian process. This is why we refer to this noise 
model as local non-Markovian noise, because it is a process that has a 'memory' and acts 
locally. 

To aid our analysis we discretize the evolution of our system. Say that the total time required
for our computation is T and let $t_0$ be the time it takes to execute a fundamental gate. We now
divide our total time T into N intervals of length $\Delta$ such that $t_0 \gg \Delta$, where
$t_0/\Delta$ is an integer. This last condition allows us to safely ignore factors of $O(\Delta^2)$
in our analysis. Because $\Delta$ is so small, we can act as though the H(t') for
$t \le t' \le t + \Delta$ are all approximately equal to H(t). We can then, using the Trotter
expansion, express the evolution of our system for the time interval $(t, t + \Delta)$ as

$$\begin{aligned}
U(t + \Delta, t) &\approx e^{-i\Delta H(t)} \\
&\approx e^{-i\Delta H_S(t)}\, e^{-i\Delta H_B(t)}\, e^{-i\Delta H_{SB}(t)} \\
&= e^{-i\Delta H_S(t)}\, e^{-i\Delta H_B(t)}\, e^{-i\Delta \sum_{a \in A_t} H_{SB,a}} \\
&\approx e^{-i\Delta H_S(t)}\, e^{-i\Delta H_B(t)} \prod_{a \in A_t} e^{-i\Delta H_{SB,a}}.
\end{aligned}$$

Note that we have disregarded terms with norm of $O(\Delta^2)$. It can be shown that for small enough
$\Delta$, the error in this approximation is small enough for our purposes. Using a Taylor expansion
we can further simplify this to

$$U(t + \Delta, t) \approx e^{-i\Delta H_S(t)}\, e^{-i\Delta H_B(t)} \prod_{a \in A_t} (I - i\Delta H_{SB,a}).$$

We can express the entire evolution of the system as a product of N such time evolution oper-
ators, each for a time interval of length $\Delta$. Writing out this product, we obtain a sum where
in each summand the interaction part will contain I for some microlocations and
$-i\Delta H_{SB,a}$ for others. We call the whole sum the fault-path decomposition of the
computation and refer to a single summand as a fault path. When a factor $-i\Delta H_{SB,a}$ occurs
in some fault path at some microlocation we say that the fault path has a fault at that
microlocation.
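
The expansion of the product into fault paths can be illustrated with a toy sketch. As a simplifying assumption, each operator $-i\Delta H_{SB,a}$ is replaced here by a complex scalar, so operator ordering and the system and bath factors are ignored; only the combinatorial structure "one summand per subset of faulty microlocations" is shown. All names and values are hypothetical.

```python
from itertools import combinations


def fault_paths(terms):
    """Yield (faulty_microlocations, weight) for each summand of prod_j (1 + terms[j]).

    terms[j] is a scalar stand-in for the fault operator at microlocation j.
    """
    n = len(terms)
    for r in range(n + 1):
        for faulty in combinations(range(n), r):
            weight = 1
            for j in faulty:
                weight *= terms[j]
            yield faulty, weight


def product_form(terms):
    """Evaluate prod_j (1 + terms[j]) directly."""
    result = 1
    for x in terms:
        result *= 1 + x
    return result


if __name__ == "__main__":
    delta = 0.01
    terms = [-1j * delta * h for h in (1.0, 2.0, 3.0)]  # toy interaction strengths
    for faulty, weight in fault_paths(terms):
        print(faulty, weight)
    print(sum(w for _, w in fault_paths(terms)), product_form(terms))
```

With three microlocations there are 2^3 = 8 fault paths, and their weights add up to the original product.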

In Section 2 we proved a threshold result by showing that if exRecs were good, then they were 
correct; that correct exRecs yield good final answers to computations and that there were few bad 
exRecs. In this section we will not only reason about exRecs, but also about fault paths. We will 
show that the norm of the sum over bad fault paths, i.e., fault paths with many faults, can be 
made arbitrarily small. Then we show that a small norm of bad fault paths leads to approximately
correct answers for the computation. From that we will conclude with a threshold theorem for 
local non-Markovian noise. 

First we still need to define the strength of such noise, cf. the error-rate $\epsilon$ from
Section 2. We express this in terms of the norm of the interaction Hamiltonian. The norm of an
operator A is defined as

$$\|A\| = \sup_{|\psi\rangle \ne 0} \frac{\|A|\psi\rangle\|}{\||\psi\rangle\|},$$

where $\||\psi\rangle\|$ is the Euclidean norm of a state $|\psi\rangle$, i.e.,
$\||\psi\rangle\| = \sqrt{\langle\psi|\psi\rangle}$. We shall make use of the following properties
of the norm. For all operators A and B we have

$$\|A + B\| \le \|A\| + \|B\| \quad\text{and}\quad \|AB\| \le \|A\| \cdot \|B\| = \|A \otimes B\|.$$

Let $\lambda_0$ be an upper bound for the norm of the $H_{SB,a}$ Hamiltonians. In other words, for
all times t and all $a \in A_t$ we have $\|H_{SB,a}\| \le \lambda_0$.

Because we picked $\Delta$ such that $t_0/\Delta$ is an integer and $t_0 \gg \Delta$, the time spent
executing a single gate is divided into many microlocations, which we can group into so-called
locations. Note that these locations are the same as the locations we considered in Section 2.
Given a set $\mathcal{I}_r$ of r locations we let $E(\mathcal{I}_r)$ be the sum of all fault paths
with faults at all of the r locations in $\mathcal{I}_r$. If for all r and all $\mathcal{I}_r$ with
$|\mathcal{I}_r| = r$ we have $\|E(\mathcal{I}_r)\| \le \eta^r$, we call $\eta$ the noise strength.
The motivation for this definition of noise strength is that in this model the noise is caused by
energy being transferred from the system to the bath or vice versa. The strength of these
interactions thus determines the strength of the noise.

As in Section 2 we can reason about locations being hit by a fault or "being faulty". We say
that a location is hit by a fault if at least one of the microlocations that it is made up of is faulty.
As before we call our ideal circuit $M_0$ and replace all locations in $M_0$ by 1-Recs to obtain
$M_1$ and so on. Note that each fault path describes a quantum evolution and can therefore be seen
as a circuit in itself. This allows us to say that a fault path of $M_1$ is good if each 1-Rec
contains at most 1 fault and it is bad otherwise. In general a fault path of $M_k$ is good if each
k-Rec contains at most one bad (k - 1)-Rec and it is bad otherwise. We need to bound the norm of the
sum of all bad fault paths of $M_k$. We start by considering how to bound the norm of the sum of all
fault paths with at least 2 faults in $M_0$. From this we will obtain a bound on the sum of all bad
fault paths in $M_1$ and from that our desired bound on the sum of the bad fault paths in $M_k$.

Let F denote the sum of all fault paths with at least 2 faults in $M_0$. We would like to express F
as a sum of $E(\mathcal{I}_r)$s, because we have a bound for their norms. Let $\mathcal{I}$ denote
the set of all locations in $M_0$. We cannot say that
$F = \sum_{r=2}^{|\mathcal{I}|} \sum_{\mathcal{I}_r \subseteq \mathcal{I} : |\mathcal{I}_r| = r} E(\mathcal{I}_r)$,
because we would be massively overcounting the number of bad fault paths. To see this note that a
fault path with faults at all $|\mathcal{I}|$ locations is counted for every value of r. To properly
count F we need a combinatorial lemma.

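Lemma 3.1 ([AGP06, Lemma 7]). Let $\mathcal{I}$ be the set of all locations in $M_0$, then

$$F = \sum_{r=2}^{|\mathcal{I}|} (-1)^r (r-1) \sum_{\mathcal{I}_r \subseteq \mathcal{I} : |\mathcal{I}_r| = r} E(\mathcal{I}_r).$$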

Proof. We must show that every fault path with at least two faults is counted exactly once and
that fault paths with less than two faults are not counted. The latter is easy to see, because such
a fault path will not be in any of the $E(\mathcal{I}_r)$ since we start from r = 2. To see the
former, let f be any fault path with at least two faults and say that k is the number of faults on
f. First note that f does not occur in any $E(\mathcal{I}_r)$ for r > k. So it suffices to show that
f occurs exactly once in

$$\sum_{r=2}^{k} (-1)^r (r-1) \sum_{\mathcal{I}_r \subseteq \mathcal{I} : |\mathcal{I}_r| = r} E(\mathcal{I}_r).$$

Given $2 \le r \le k$, there are $\binom{k}{r}$ sets $\mathcal{I}_r \subseteq \mathcal{I}$ with
$|\mathcal{I}_r| = r$ that are a subset of the locations at which f has faults. Only for those
$\mathcal{I}_r$, f is counted in $E(\mathcal{I}_r)$ and it occurs there once. So the number of
times f is counted in this sum and with that in the total sum is

$$\begin{aligned}
\sum_{r=2}^{k} (-1)^r (r-1) \binom{k}{r}
&= \sum_{r=2}^{k} (-1)^r r \binom{k}{r} - \sum_{r=2}^{k} (-1)^r \binom{k}{r} \\
&= k \sum_{r=2}^{k} (-1)^r \binom{k-1}{r-1} - \left( \sum_{r=0}^{k} (-1)^r \binom{k}{r} - 1 + k \right) \\
&= k \sum_{r=2}^{k} (-1)^r \binom{k-1}{r-1} - (k-1) \\
&= -k \sum_{l=1}^{k-1} (-1)^l \binom{k-1}{l} - (k-1) \\
&= -k(0 - 1) - (k-1) \\
&= k - k + 1 = 1,
\end{aligned}$$

where for the third and fifth equality we use the binomial theorem, which states in particular that
$\sum_{r=0}^{n} (-1)^r \binom{n}{r} = (1 - 1)^n = 0$ for $n \ge 1$. □
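
The counting argument of Lemma 3.1 is easy to verify by brute force. The sketch below (function names are illustrative) compares the naive count, which overcounts, with the signed count $\sum_{r=2}^{k} (-1)^r (r-1) \binom{k}{r}$ for a fault path with k faults.

```python
from math import comb


def naive_count(k):
    """Times a path with k faults appears in the unsigned sum over all r >= 2."""
    return sum(comb(k, r) for r in range(2, k + 1))


def signed_count(k):
    """Net multiplicity of the same path in the signed sum of Lemma 3.1."""
    return sum((-1) ** r * (r - 1) * comb(k, r) for r in range(2, k + 1))


if __name__ == "__main__":
    for k in range(0, 8):
        print(k, naive_count(k), signed_count(k))
```

The naive count grows like 2^k, while the signed count is 1 for every k of at least 2 and 0 for k below 2.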

We can now compute $\|F\|$. Note that the norm of a sum is bounded by a sum of norms and
that there are $\binom{|\mathcal{I}|}{r}$ different $\mathcal{I}_r \subseteq \mathcal{I}$ with
$|\mathcal{I}_r| = r$. Letting $|\mathcal{I}| = A$ we have

$$\|F\| \le \sum_{r=2}^{A} (r-1) \binom{A}{r} \eta^r
\le \binom{A}{2} \eta^2 \sum_{r=2}^{A} \binom{A-2}{r-2} \eta^{r-2}
= \binom{A}{2} \eta^2 (1 + \eta)^{A-2}
\le \binom{A}{2} \eta^2 e^{(A-2)\eta},$$

where the last inequality is because $1 + \eta \le e^{\eta}$. Observe that we need to drop the
negative signs from Lemma 3.1, because the fault paths can positively interfere with one another.
The only thing we can safely say for any $E(\mathcal{I}_r)$ and $E(\mathcal{I}'_r)$ is that
$\|E(\mathcal{I}_r) - E(\mathcal{I}'_r)\| \le \|E(\mathcal{I}_r)\| + \|E(\mathcal{I}'_r)\|$.
This and the assumption that the noise obeys $\|E(\mathcal{I}_r)\| \le \eta^r$ explains the first
inequality.
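
As a sanity check on this chain of inequalities, the following sketch evaluates the sign-free sum $\sum_{r=2}^{A} (r-1)\binom{A}{r}\eta^r$ directly and compares it with the closed-form bound $\binom{A}{2}\eta^2 e^{(A-2)\eta}$. The function names and the sample values of A and eta are assumptions made for illustration.

```python
from math import comb, exp


def signless_sum(A, eta):
    """Triangle-inequality bound on ||F||: sum over r of (r - 1) * C(A, r) * eta**r."""
    return sum((r - 1) * comb(A, r) * eta ** r for r in range(2, A + 1))


def closed_bound(A, eta):
    """Closed-form upper bound C(A, 2) * eta**2 * exp((A - 2) * eta)."""
    return comb(A, 2) * eta ** 2 * exp((A - 2) * eta)


if __name__ == "__main__":
    for A, eta in ((5, 1e-3), (20, 1e-2), (50, 1e-2)):
        print(A, eta, signless_sum(A, eta), closed_bound(A, eta))
```

For small eta the two values are close, since both are dominated by the leading term $\binom{A}{2}\eta^2$.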

To bound the norm of the bad fault paths in $M_1$ we will need some additional notation. We
let $\mathcal{I}^{(1)}_r$ denote a set of 1-Recs in $M_1$ with $|\mathcal{I}^{(1)}_r| = r$ and let
$E(\mathcal{I}^{(1)}_r)$ be the sum of all fault paths where all of the 1-Recs in
$\mathcal{I}^{(1)}_r$ are bad. Recall that a 1-Rec is bad if it contains at least 2 faults. For a
given $\mathcal{I}^{(1)}_r$ we can label the bad 1-Recs it contains by $b \in [r]$ according to some
arbitrary ordering. We then let $\mathcal{I}(b)$ be the set of all locations in the 1-Rec labeled by
b. In general we define $\mathcal{I}^{(k)}_r$ to be a set of k-Recs in $M_k$ with
$|\mathcal{I}^{(k)}_r| = r$ and we let $E(\mathcal{I}^{(k)}_r)$ be the sum of all fault paths where
all of the k-Recs in $\mathcal{I}^{(k)}_r$ are bad. For a given $\mathcal{I}^{(k)}_r$ we label the
bad k-Recs by $b \in [r]$. We let $\mathcal{I}^{(k)}(b)$ be the set of all (k - 1)-Recs in the k-Rec
labeled by b.

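Lemma 3.2 ([AGP06, Lemmas 8, 9]). If for all r and all $\mathcal{I}_r$ with $|\mathcal{I}_r| = r$,
$\|E(\mathcal{I}_r)\| \le \eta^r$ and $\eta < \frac{1}{\binom{A}{2} e^{(A-2)\eta}}$, then

$$\|E(\mathcal{I}^{(k)}_r)\| \le (\eta^{(k)})^r,$$

where

$$\eta^{(k)} = \frac{1}{\binom{A}{2} e^{(A-2)\eta}} \left( \binom{A}{2} \eta\, e^{(A-2)\eta} \right)^{2^k},$$

$|\mathcal{I}^{(k)}_r| = r$ and A is the maximum number of locations in any 1-Rec.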



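Proof. We prove this by induction on k. For k = 1 we have that $E(\mathcal{I}^{(1)}_r)$ is the sum
over all fault paths that have $\ge 2$ faults at every 1-Rec in $\mathcal{I}^{(1)}_r$. Let
$r = |\mathcal{I}^{(1)}_r|$ and note that we label the 1-Recs in $\mathcal{I}^{(1)}_r$ by
$1 \le b \le r$. Using Lemma 3.1, $E(\mathcal{I}^{(1)}_r)$ consists of all the fault paths that
occur for every 1-Rec in $\mathcal{I}^{(1)}_r$ in the expression

$$\sum_{\ell_b=2}^{|\mathcal{I}(b)|} (-1)^{\ell_b} (\ell_b - 1) \sum_{\mathcal{J}(b) \subseteq \mathcal{I}(b) : |\mathcal{J}(b)| = \ell_b} E(\mathcal{J}(b)).$$

To obtain $E(\mathcal{I}^{(1)}_r)$ we can thus sum over the sets of locations in each 1-Rec
independently and consider only the fault paths with faults at all the locations in each
$\mathcal{J}(b)$. So we have

$$E(\mathcal{I}^{(1)}_r) = \sum_{\ell_1=2}^{|\mathcal{I}(1)|} (-1)^{\ell_1} (\ell_1 - 1) \cdots \sum_{\ell_r=2}^{|\mathcal{I}(r)|} (-1)^{\ell_r} (\ell_r - 1) \sum_{\mathcal{J}(1) \subseteq \mathcal{I}(1) : |\mathcal{J}(1)| = \ell_1} \cdots \sum_{\mathcal{J}(r) \subseteq \mathcal{I}(r) : |\mathcal{J}(r)| = \ell_r} E\left( \bigcup_{i=1}^{r} \mathcal{J}(i) \right).$$

By our bound on the noise strength we have

$$\left\| E\left( \bigcup_{i=1}^{r} \mathcal{J}(i) \right) \right\| \le \eta^{\sum_{i=1}^{r} \ell_i} = \prod_{i=1}^{r} \eta^{\ell_i},$$

because $|\bigcup_{i=1}^{r} \mathcal{J}(i)| = \sum_{i=1}^{r} \ell_i$. Dropping the signs as in the
analysis of $\|F\|$, the norm of $E(\mathcal{I}^{(1)}_r)$ now becomes

$$\begin{aligned}
\|E(\mathcal{I}^{(1)}_r)\|
&\le \sum_{\ell_1=2}^{|\mathcal{I}(1)|} (\ell_1 - 1) \cdots \sum_{\ell_r=2}^{|\mathcal{I}(r)|} (\ell_r - 1) \sum_{\mathcal{J}(1) \subseteq \mathcal{I}(1) : |\mathcal{J}(1)| = \ell_1} \cdots \sum_{\mathcal{J}(r) \subseteq \mathcal{I}(r) : |\mathcal{J}(r)| = \ell_r} \prod_{i=1}^{r} \eta^{\ell_i} \\
&= \prod_{i=1}^{r} \sum_{\ell_i=2}^{|\mathcal{I}(i)|} (\ell_i - 1) \sum_{\mathcal{J}(i) \subseteq \mathcal{I}(i) : |\mathcal{J}(i)| = \ell_i} \eta^{\ell_i} \\
&= \prod_{i=1}^{r} \sum_{\ell_i=2}^{|\mathcal{I}(i)|} (\ell_i - 1) \binom{|\mathcal{I}(i)|}{\ell_i} \eta^{\ell_i}.
\end{aligned}$$

So by our previous analysis of $\|F\|$ and letting A be the maximum number of locations in any
1-Rec we obtain that

$$\|E(\mathcal{I}^{(1)}_r)\| \le \left( \binom{A}{2} \eta^2 e^{(A-2)\eta} \right)^r = (\eta^{(1)})^r,$$

thus proving our basis case. For k + 1 we note that $E(\mathcal{I}^{(k+1)}_r)$ is the sum over all
fault paths with $\ge 2$ bad k-Recs in each (k + 1)-Rec in $\mathcal{I}^{(k+1)}_r$. Reasoning as
before we thus obtain

$$\|E(\mathcal{I}^{(k+1)}_r)\| \le \prod_{i=1}^{r} \sum_{\ell_i=2}^{|\mathcal{I}^{(k+1)}(i)|} (\ell_i - 1) \binom{|\mathcal{I}^{(k+1)}(i)|}{\ell_i} (\eta^{(k)})^{\ell_i} \le \left( \binom{A}{2} (\eta^{(k)})^2 e^{(A-2)\eta^{(k)}} \right)^r.$$

By induction hypothesis and the assumption that $\eta < \frac{1}{\binom{A}{2} e^{(A-2)\eta}}$ we have

$$\eta^{(k)} = \frac{1}{\binom{A}{2} e^{(A-2)\eta}} \left( \binom{A}{2} \eta\, e^{(A-2)\eta} \right)^{2^k} \le \frac{1}{\binom{A}{2} e^{(A-2)\eta}} \binom{A}{2} \eta\, e^{(A-2)\eta} = \eta.$$

Now we can simplify our bound as

$$\|E(\mathcal{I}^{(k+1)}_r)\| \le \left( \binom{A}{2} (\eta^{(k)})^2 e^{(A-2)\eta} \right)^r.$$

When we fill in the value of $\eta^{(k)}$ obtained from the induction hypothesis we get

$$\eta^{(k+1)} = \binom{A}{2} e^{(A-2)\eta} \left( \frac{\left( \binom{A}{2} \eta\, e^{(A-2)\eta} \right)^{2^k}}{\binom{A}{2} e^{(A-2)\eta}} \right)^2 = \frac{1}{\binom{A}{2} e^{(A-2)\eta}} \left( \binom{A}{2} \eta\, e^{(A-2)\eta} \right)^{2^{k+1}},$$

thus proving the lemma. □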



This lemma is in much the same spirit as Lemma 2.3 and like with that lemma the bound can 
be improved by not counting benign pairs of faulty locations, see [AGP06, Section 11.5] for details. 
We are now ready to prove our threshold result for local non-Markovian noise. 
Theorem 3.3 ([AGP06, Theorem 6]). Let A be the maximum number of locations in any 1-Rec and
assume a noise model where for all r and all $\mathcal{I}_r$ with $|\mathcal{I}_r| = r$,
$\|E(\mathcal{I}_r)\| \le \eta^r$. If $\eta < \frac{1}{\binom{A}{2} e^{(A-2)\eta}}$,
then for any $\delta > 0$ there exists a level k such that $M_k$ simulates a given circuit $M_0$ with
error at most $\delta$.

Proof. Let us write the fault path expansion of $M_k$ as $G_k + B_k$, where $G_k$ contains all the
fault paths without any bad k-exRecs and $B_k$ contains all the fault paths with at least one bad
k-exRec. Let L be the number of locations in $M_0$ and hence the number of k-exRecs in $M_k$, and let
$\mathcal{I}^{(k)}$ denote the set of all of them. We claim that

$$B_k = \sum_{r=1}^{L} (-1)^{r+1} \sum_{\mathcal{I}^{(k)}_r \subseteq \mathcal{I}^{(k)} : |\mathcal{I}^{(k)}_r| = r} E(\mathcal{I}^{(k)}_r).$$

To prove this claim we must show that every fault path with at least one bad k-exRec is counted
exactly once and that fault paths without bad k-exRecs are not counted. The latter is a trivial
observation. To see the former, let f be any fault path with at least one bad k-exRec and say that
j is the number of bad k-exRecs on f. First note that f does not occur in any
$E(\mathcal{I}^{(k)}_r)$ for r > j. So it suffices to show that f occurs exactly once in

$$\sum_{r=1}^{j} (-1)^{r+1} \sum_{\mathcal{I}^{(k)}_r \subseteq \mathcal{I}^{(k)} : |\mathcal{I}^{(k)}_r| = r} E(\mathcal{I}^{(k)}_r).$$

Given $1 \le r \le j$, there are $\binom{j}{r}$ sets $\mathcal{I}^{(k)}_r \subseteq \mathcal{I}^{(k)}$
of size r that are a subset of the k-exRecs that are bad on f. Only for those $\mathcal{I}^{(k)}_r$,
f is counted and for each of them it is counted exactly once. So the number of times f is counted in
this sum and hence in $B_k$ is

$$\sum_{r=1}^{j} (-1)^{r+1} \binom{j}{r} = \sum_{r=0}^{j} (-1)^{r+1} \binom{j}{r} + 1 = -\sum_{r=0}^{j} (-1)^{r} \binom{j}{r} + 1 = 1,$$

where the last equality is a consequence of the binomial theorem. Now we can upper bound $\|B_k\|$
as we did $\|F\|$ using Lemma 3.2 as

$$\begin{aligned}
\|B_k\| &\le \sum_{r=1}^{L} \binom{L}{r} (\eta^{(k)})^r
\le L \sum_{r=1}^{L} \binom{L-1}{r-1} (\eta^{(k)})^r \\
&= L\eta^{(k)} \sum_{t=0}^{L-1} \binom{L-1}{t} (\eta^{(k)})^t
= L\eta^{(k)} (\eta^{(k)} + 1)^{L-1}
\le L\eta^{(k)} e^{(L-1)\eta^{(k)}},
\end{aligned}$$

where the last equality is again a consequence of the binomial theorem and the last inequality is
because $\eta^{(k)} + 1 \le e^{\eta^{(k)}}$.

It is a known fact that the $L_1$-distance between measurements of two states is at most twice
the Euclidean distance between those states. Therefore the computation error $\delta$, which is the
$L_1$-distance between the measurement of the ideal circuit and that of $M_k$, can be bounded by
twice the maximum Euclidean distance between the final state of the ideal circuit and that of $M_k$.
This distance in turn is at most $\|B_k\|$, therefore

$$\delta \le 2\|B_k\| \le 2L\eta^{(k)} e^{(L-1)\eta^{(k)}} \le 2L\eta^{(k)} e^{(L-1)\eta},$$

where the last inequality is because $\eta^{(k)} \le \eta$ as we argued in the proof of Lemma 3.2.
We can rewrite this inequality to see that we can pick k such that

$$2^k \ge \frac{\log\left( \dfrac{\delta \binom{A}{2} e^{(A-2)\eta}}{2L e^{(L-1)\eta}} \right)}{\log\left( \binom{A}{2} \eta\, e^{(A-2)\eta} \right)}$$

to achieve an error less than or equal to $\delta$. □
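
The level-k noise strength and the required level can be evaluated with a short sketch. It assumes the closed form $\eta^{(k)} = (c\eta)^{2^k}/c$ with $c = \binom{A}{2} e^{(A-2)\eta}$ from Lemma 3.2 and the error bound $\delta \le 2L\eta^{(k)} e^{(L-1)\eta}$ from the proof of Theorem 3.3. The function names and parameter values are illustrative, and the exponential factor overflows for very large L, so this is only meant for small toy instances.

```python
from math import comb, exp


def eta_level(A, eta, k):
    """eta^(k) = (c * eta) ** (2 ** k) / c with c = C(A, 2) * exp((A - 2) * eta)."""
    c = comb(A, 2) * exp((A - 2) * eta)
    return (c * eta) ** (2 ** k) / c


def min_level(A, eta, L, delta):
    """Smallest k with 2 * L * eta^(k) * exp((L - 1) * eta) <= delta."""
    c = comb(A, 2) * exp((A - 2) * eta)
    if not 0 < c * eta < 1:
        raise ValueError("noise strength eta is not below the threshold condition")
    if delta <= 0:
        raise ValueError("need delta > 0")
    k = 0
    while 2 * L * exp((L - 1) * eta) * eta_level(A, eta, k) > delta:
        k += 1
    return k


if __name__ == "__main__":
    A, eta, L = 10, 1e-3, 100  # hypothetical toy parameters
    for k in range(4):
        print(k, eta_level(A, eta, k))
    print(min_level(A, eta, L, 1e-6))
```

As in Section 2, each additional level squares the factor $c\eta$, so a handful of levels suffices in this toy instance.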



As with Theorem 2.4, k scales at about $\log\log(1/\delta)$. We might improve the threshold by
considering QECCs that can correct more than one error. In [AKP06] Aharonov, Kitaev and
Preskill prove a threshold result¹ for a non-Markovian noise model that allows interactions between
arbitrary pairs of qubits, even if they are not correlated by the circuit. They obtain their result
by bounding the norm of the interaction Hamiltonian by the inverse of the (physical) distance
between the qubits that it correlates. This result will not be discussed in detail in this paper.

4 Objections to Fault-Tolerant Quantum Computing 

The first objection that comes to mind when considering FTQC is that we may simply not be able 
to construct quantum computers that are shielded from noise well enough so that the strength 
of the noise that makes it through to the computer is below the proved threshold value. So the 
question arises if bounds can be given for the noise strength above which FTQC is impossible. 
For certain noise models such bounds have indeed been proved. For example it was shown by 
Buhrman, Cleve, Laurent, Linden, Schrijver and Unger in [BCL+06] that quantum computers
cannot withstand "depolarizing noise" that hits with probability $p \approx 0.45$. Depolarizing
noise is noise that acts independently on single qubits and replaces a qubit with the completely
mixed state $\begin{pmatrix} 1/2 & 0 \\ 0 & 1/2 \end{pmatrix}$ with probability p and leaves it
untouched with probability 1 - p. A bound of $p \approx 0.36$
was later given by Kempe, Regev, Unger and de Wolf in [KRUdW08] for a slightly weaker noise
model. 

This section is devoted to qualitative objections rather than such quantitative ones. We have 
seen that any error on a single qubit can be corrected by for example the Shor code. When a 
superposition of states is exposed to such an error, it impacts all terms of the superposition in the 
same way. We can also think of noise that only hits certain terms in a superposition, while leaving 
others undisturbed. Such noise we call controlled noise, because it is applied only to terms that 
meet certain conditions. In this section we will deal with two types of controlled noise: controlled
bit flips (X errors) and controlled phase flips (Z errors). Ben-Aroya and Ta-Shma show in [BT11] 
that controlled bit flips cannot be perfectly corrected, but can be approximately corrected. On the 
other hand, they also show that controlled phase flips cannot even be approximately corrected. 
We will treat these subjects in that order. 

4.1 Controlled bit flips cannot be perfectly corrected 

The well known CNOT gate, for 'controlled not', acts on two qubits and flips the second only if
the first is $|1\rangle$. We generalize this definition, allowing controlled bit flips to be
conditioned on any number of qubits in the state, in any combination. We let [n] denote the set
$\{1, \ldots, n\}$ and say that for any $i \in [n]$ and $S \subseteq \{0,1\}^{n-1}$ the operator
$E_{i,S}$ applies X to the i'th qubit conditioned on the other qubits being in S. More formally if
$x \in \{0,1\}^n$ we define $x_{-i} \in \{0,1\}^{n-1}$ to be
$x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n$. We let $X^i$ be the operator that flips the i'th qubit
of an n-qubit state, i.e., $X^i = I^{\otimes (i-1)} \otimes X \otimes I^{\otimes (n-i)}$. Now we say
that for $i \in [n]$ and $S \subseteq \{0,1\}^{n-1}$,

$$E_{i,S}|x\rangle = \begin{cases} X^i|x\rangle & \text{if } x_{-i} \in S, \\ |x\rangle & \text{otherwise.} \end{cases}$$

To obtain the full definition we extend the above one linearly.
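
The action of a controlled bit flip can be made concrete with a small sketch. It assumes that $E_{i,S}$ maps a basis state $|x\rangle$ to $X^i|x\rangle$ when $x_{-i} \in S$ and leaves it unchanged otherwise, extended linearly. States are represented as dictionaries from bitstrings to amplitudes; this representation and the function name are choices made here for illustration only.

```python
from math import sqrt


def apply_controlled_flip(state, i, S):
    """Apply E_{i,S} to a state given as {bitstring: amplitude}.

    Qubit i (1-indexed) of a basis string x is flipped exactly when the
    remaining bits x_{-i}, read as a string, belong to the set S.
    """
    out = {}
    for x, amp in state.items():
        rest = x[:i - 1] + x[i:]
        if rest in S:
            flipped = "1" if x[i - 1] == "0" else "0"
            x = x[:i - 1] + flipped + x[i:]
        out[x] = out.get(x, 0) + amp
    return out


if __name__ == "__main__":
    a = 1 / sqrt(2)
    epr = {"00": a, "11": a}
    print(apply_controlled_flip(epr, 1, {"0"}))       # flip qubit 1 only if qubit 2 is 0
    print(apply_controlled_flip(epr, 1, {"0", "1"}))  # unconditional X on qubit 1
```

Taking S to be all of $\{0,1\}^{n-1}$ gives the ordinary bit flip $X^i$, while the empty set gives the identity.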

We are now ready to define the classes of errors of interest to us. We start with
$\mathcal{E}_{\mathrm{cbit}}$ for controlled bit flips, which we define as



¹ In [Ali07] Alicki provides comments on that result. In [Ali04] he comments on the result by Terhal and Burkard
([TB05]) that the work by Aliferis, Gottesman and Preskill ([AGP06]) presented in this section is based on. These
comments mainly deal with physical objections to the proposed noise models. As a computer scientist, this author's
understanding of such objections is unfortunately too limited to gauge their impact.




$$\mathcal{E}_{\mathrm{cbit}} := \{ E_{i,S} \mid i \in [n],\ S \subseteq \{0,1\}^{n-1} \}.$$



For the proof that controlled bit flips cannot be perfectly corrected we restrict our attention to a
subset of errors called $\mathcal{E}_{\mathrm{singletons}}$, defined by

$$\mathcal{E}_{\mathrm{singletons}} := \{ E_{i,\{s\}} \mid i \in [n],\ s \in \{0,1\}^{n-1} \}.$$

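To be able to rigorously formulate our theorems we must first have a proper definition for what
it means that a QECC is able to correct an error. We let $L(\mathcal{N}, \mathcal{N}')$ denote the
set of all linear operations from the space $\mathcal{N}$ to the space
$\mathcal{N}' \subseteq \mathcal{N}$.

Definition 4.1. A QECC $\mathcal{M}$ corrects $\mathcal{E} \subseteq L(\mathcal{N}, \mathcal{N}')$
if for any two operators $A, B \in \mathcal{E}$ and any two codewords
$\varphi, \psi \in \mathcal{M}$,

$$\langle\varphi|\psi\rangle = 0 \Rightarrow \langle\varphi|A^*B|\psi\rangle = 0.$$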

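During the proof of the first theorem we will make use of the following lemma which gives us
a necessary condition for a QECC being able to correct an error. This lemma is a weaker version
of Fact 2.1 in [BT11].

Lemma 4.2. If a code $\mathcal{M}$ corrects $\mathcal{E} \subseteq L(\mathcal{N}, \mathcal{N}')$,
then for any two operators $A, B \in \mathcal{E}$ and any two codewords
$\varphi, \psi \in \mathcal{M}$,

$$\langle\varphi|A^*B|\varphi\rangle = \langle\psi|A^*B|\psi\rangle.$$

Proof. Let A, B be any two operators in $\mathcal{E}$ and let $\varphi_1, \varphi_2$ be two vectors
from an orthonormal basis of $\mathcal{M}$. If $\varphi_1 = \varphi_2$ then we already have
$\langle\varphi_1|A^*B|\varphi_1\rangle = \langle\varphi_2|A^*B|\varphi_2\rangle$, so assume
$\varphi_1 \ne \varphi_2$. Since these are basis vectors we have that $\varphi_1 + \varphi_2$ and
$\varphi_1 - \varphi_2$ are codewords themselves and also orthogonal (we omit the normalization for
brevity). By Definition 4.1 we thus have that

$$\langle\varphi_1 + \varphi_2|A^*B|\varphi_1 - \varphi_2\rangle = 0$$

and from that

$$\langle\varphi_1|A^*B|\varphi_1\rangle - \langle\varphi_1|A^*B|\varphi_2\rangle + \langle\varphi_2|A^*B|\varphi_1\rangle - \langle\varphi_2|A^*B|\varphi_2\rangle = 0.$$

Again from Definition 4.1 we have that
$\langle\varphi_1|A^*B|\varphi_2\rangle = 0 = \langle\varphi_2|A^*B|\varphi_1\rangle$. Thus we obtain

$$\langle\varphi_1|A^*B|\varphi_1\rangle = \langle\varphi_2|A^*B|\varphi_2\rangle.$$

Now let $\{\varphi_i\}$ be some orthonormal basis of $\mathcal{M}$ and let
$\varphi = \sum_i \alpha_i |\varphi_i\rangle$ be some codeword in $\mathcal{M}$. We compute that

$$\langle\varphi|A^*B|\varphi\rangle = \Big( \sum_i \alpha_i^* \langle\varphi_i| \Big) A^*B \Big( \sum_j \alpha_j |\varphi_j\rangle \Big) = \sum_{i,j} \alpha_i^* \alpha_j \langle\varphi_i|A^*B|\varphi_j\rangle.$$

Note that for $i \ne j$ we have
$\langle\varphi_i|\varphi_j\rangle = 0 = \langle\varphi_i|A^*B|\varphi_j\rangle$ by Definition 4.1,
so we are left with

$$\langle\varphi|A^*B|\varphi\rangle = \sum_i \alpha_i^* \alpha_i \langle\varphi_i|A^*B|\varphi_i\rangle = \sum_i |\alpha_i|^2 \langle\varphi_i|A^*B|\varphi_i\rangle.$$

As we have shown $\langle\varphi_i|A^*B|\varphi_i\rangle = \langle\varphi_j|A^*B|\varphi_j\rangle$
for any two basis vectors $\varphi_i, \varphi_j$ of $\mathcal{M}$. Therefore letting $\varphi_1$ be
some arbitrary basis vector of $\mathcal{M}$ we now arrive at

$$\langle\varphi|A^*B|\varphi\rangle = \langle\varphi_1|A^*B|\varphi_1\rangle \cdot \sum_i |\alpha_i|^2 = \langle\varphi_1|A^*B|\varphi_1\rangle,$$

because $\varphi$ is a codeword and thus assumed to have unit norm. □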

Now we are ready for the first negative result. We remind the reader that a QECC $\mathcal{M}$ with
$\dim(\mathcal{M}) = 2^0$ is a space spanned by a single vector. By definition this means that such a
code cannot be used to encode two different qubit states, say $|0\rangle$ and $|1\rangle$, because
their encoded versions would be identical, i.e., $|\bar{0}\rangle = |\bar{1}\rangle$, and neither is
recoverable. In particular such a QECC would not be able to correct any errors and as such cannot
play a role in bringing about fault-tolerant quantum computing.

Theorem 4.3 ([BT11, Theorem 3.1]). There is no QECC $\mathcal{M}$ with $\dim(\mathcal{M}) > 2^0$
that can correct $\mathcal{E}_{\mathrm{singletons}}$.

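Proof. Assume there is a QECC $\mathcal{M}$ with $\dim(\mathcal{M}) > 2^0$ that can correct
$\mathcal{E}_{\mathrm{singletons}}$ and let $\varphi = \sum_{x \in \{0,1\}^n} \varphi(x)|x\rangle$
and $\psi = \sum_{x \in \{0,1\}^n} \psi(x)|x\rangle$ be two orthonormal codewords of $\mathcal{M}$.
Fix $i \in [n]$ and $q \in \{0,1\}^n$. Denote $E = I^* E_{i,\{q_{-i}\}}$ and $q' = q \oplus e_i$.
We can now compute that

$$\begin{aligned}
\langle\varphi|E|\varphi\rangle
&= \langle\varphi| \big( |\varphi\rangle + (\varphi(q') - \varphi(q))|q\rangle + (\varphi(q) - \varphi(q'))|q'\rangle \big) \\
&= \langle\varphi|\varphi\rangle + \langle\varphi| \big( (\varphi(q') - \varphi(q))|q\rangle + (\varphi(q) - \varphi(q'))|q'\rangle \big) \\
&= 1 + \varphi(q)^*(\varphi(q') - \varphi(q)) + \varphi(q')^*(\varphi(q) - \varphi(q')) \\
&= 1 - \varphi(q)^*(\varphi(q) - \varphi(q')) + \varphi(q')^*(\varphi(q) - \varphi(q')) \\
&= 1 - (\varphi(q)^* - \varphi(q')^*)(\varphi(q) - \varphi(q')) \\
&= 1 - (\varphi(q) - \varphi(q'))^*(\varphi(q) - \varphi(q')) \\
&= 1 - |\varphi(q) - \varphi(q')|^2.
\end{aligned}$$

Analogously we obtain that $\langle\psi|E|\psi\rangle = 1 - |\psi(q) - \psi(q')|^2$. Furthermore

$$\begin{aligned}
\langle\varphi|E|\psi\rangle
&= \langle\varphi|E|\psi\rangle - \langle\varphi|\psi\rangle \\
&= \langle\varphi| \big( E|\psi\rangle - |\psi\rangle \big) \\
&= \langle\varphi| \big( (\psi(q') - \psi(q))|q\rangle + (\psi(q) - \psi(q'))|q'\rangle \big) \\
&= \varphi(q)^*(\psi(q') - \psi(q)) + \varphi(q')^*(\psi(q) - \psi(q')) \\
&= -\varphi(q)^*(\psi(q) - \psi(q')) + \varphi(q')^*(\psi(q) - \psi(q')) \\
&= -(\varphi(q)^* - \varphi(q')^*)(\psi(q) - \psi(q')) \\
&= -(\varphi(q) - \varphi(q'))^*(\psi(q) - \psi(q')).
\end{aligned}$$

Because $\langle\varphi|\psi\rangle = 0$ it follows from Definition 4.1 that
$\langle\varphi|E|\psi\rangle = 0$ as well. So either $\varphi(q) - \varphi(q') = 0$ or
$\psi(q) - \psi(q') = 0$. In case of the former we have that $\varphi(q) = \varphi(q')$. In case of
the latter we have $\psi(q) = \psi(q')$; suppose for contradiction that
$\varphi(q) \ne \varphi(q')$. From this we may conclude that $\langle\psi|E|\psi\rangle = 1$, while
$\langle\varphi|E|\varphi\rangle \ne 1$, which contradicts Lemma 4.2, letting A = I and
$B = E_{i,\{q_{-i}\}}$. So for any codeword $\varphi$ and any $i \in [n]$ we have that
$\varphi(q) = \varphi(q \oplus e_i)$. So all codewords are completely uniform and hence there is only
one codeword, up to multiplication by a scalar. This contradicts the assumption that there is a QECC
$\mathcal{M}$ with $\dim(\mathcal{M}) > 1$ that can correct $\mathcal{E}_{\mathrm{singletons}}$. □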
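
The central identity of the proof, $\langle\varphi|E|\varphi\rangle = 1 - |\varphi(q) - \varphi(q')|^2$ for $E = E_{i,\{q_{-i}\}}$, can be checked numerically. The sketch below relies on the observation that such a singleton error exchanges the amplitudes of $|q\rangle$ and $|q'\rangle$ and leaves all others alone. The helper names and the randomly generated test state are assumptions made for illustration.

```python
import itertools
import math
import random


def random_state(n, seed=0):
    """Random normalized n-qubit state as a dict {bitstring: complex amplitude}."""
    rng = random.Random(seed)
    basis = ["".join(bits) for bits in itertools.product("01", repeat=n)]
    amps = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in basis]
    norm = math.sqrt(sum(abs(a) ** 2 for a in amps))
    return {x: a / norm for x, a in zip(basis, amps)}


def flip_bit(q, i):
    """Return q xor e_i for a bitstring q and a 1-indexed position i."""
    return q[:i - 1] + ("1" if q[i - 1] == "0" else "0") + q[i:]


def apply_singleton_error(phi, q, i):
    """Apply E_{i,{q_-i}}: the amplitudes of q and q' = q xor e_i trade places."""
    q_prime = flip_bit(q, i)
    out = dict(phi)
    out[q], out[q_prime] = phi[q_prime], phi[q]
    return out


def inner(phi, psi):
    """Inner product <phi|psi>."""
    return sum(phi[x].conjugate() * psi[x] for x in phi)


if __name__ == "__main__":
    phi = random_state(3, seed=1)
    q, i = "010", 2
    lhs = inner(phi, apply_singleton_error(phi, q, i))
    rhs = 1 - abs(phi[q] - phi[flip_bit(q, i)]) ** 2
    print(lhs, rhs)
```

The expectation value equals 1 only when the two amplitudes coincide, which is the uniformity condition the proof derives.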

We can give the following intuition for this result. Consider the state
$|\varphi\rangle = \frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$, i.e., the EPR pair, and let
$E = E_{1,\{0\}}$. We will need at least 2 ancilla qubits to write down the error syndrome for this
state. An X error could hit either qubit and it is possible that no error occurs. We let
$|\mathrm{synd}(X^1)\rangle$ denote the state of the ancillas when an X error has been detected on
the first qubit and $|\mathrm{synd}(I)\rangle$ denote that when no error has occurred. Now
$E|\varphi\rangle$ together with its syndrome is
$\frac{1}{\sqrt{2}}(|10\rangle \otimes |\mathrm{synd}(X^1)\rangle + |11\rangle \otimes |\mathrm{synd}(I)\rangle)$.
Measuring the syndrome will now cause the state to collapse to either
$|10\rangle \otimes |\mathrm{synd}(X^1)\rangle$ or $|11\rangle \otimes |\mathrm{synd}(I)\rangle$ and
after error-correction we are left with either $|00\rangle$ or $|11\rangle$.

This is very different indeed from the case where the error itself is a linear combination of Pauli
errors, for example let $E = \frac{1}{\sqrt{2}}(X^1 + Z^2)$ and consider $E|\varphi\rangle$.
Measuring the syndrome makes the state collapse to either
$X^1|\varphi\rangle \otimes |\mathrm{synd}(X^1)\rangle$ or
$Z^2|\varphi\rangle \otimes |\mathrm{synd}(Z^2)\rangle$. In either case $\varphi$ is recoverable
by applying $X^1$ or $Z^2$ respectively.

4.2 Controlled bit flips can be approximately corrected 

Even though we have seen that controlled bit flips cannot be perfectly corrected, not all is lost. 
Fortunately Ben-Aroya and Ta-Shma also proved in [BT11] that they can be approximately corrected.
This result provides a positive intermezzo in a section otherwise devoted to negative results.
To present the proof we must first define what we mean by a QECC approximately correcting an
error.

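Definition 4.4. Given a QECC $\mathcal{M}$ and $\mathcal{E} \subseteq L(\mathcal{N}, \mathcal{N}')$ we say that $\mathcal{M}$ is $(\mathcal{E}, \epsilon)$ immune if for any
operator $A \in \mathcal{E}$ and any $\varphi \in \mathcal{M}$,

\[ |\langle\varphi| A |\varphi\rangle| \geq (1 - \epsilon)\langle\varphi|\varphi\rangle, \]

where we call $\epsilon$ the approximation error.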

The motivation for this definition is that for small enough $\epsilon$, $A|\varphi\rangle \approx |\varphi\rangle$, so there is no need for
any kind of active error correction. Our goal will be to show that there is a QECC $\mathcal{M}$ with high
dimension that can approximately correct $\mathcal{E}_{\mathrm{cbit}}$ with a low approximation error. We have seen in
the proof of Theorem 4.3 that to perfectly correct controlled bit flips the vectors in the code space
must be very uniform. Unfortunately, making them uniform enough to perfectly correct the errors
means limiting $\dim(\mathcal{M})$ to 1. Our job will therefore be to conduct a careful balancing act between
the uniformity of the vectors in the code space and the dimension of the QECC.

To reason about the uniformity of vectors we first look at the influence of variables on functions.
For a function $f : \{0,1\}^m \to \mathbb{C}$ we can ask what the influence of a particular bit of the input
is on the output. We define the influence of the $i$'th bit as $I_i(f) = \mathbb{E}_{x \in \{0,1\}^m} |f(x) - f(x \oplus e_i)|^2$,
where $e_i$ is the $i$'th vector in the standard basis. The influence of a function is then defined as
$I(f) = \max_{i \in [m]} I_i(f)$. We can view $f$ as a vector $\sum_{x \in \{0,1\}^m} f(x)\,|x\rangle$. Observe that a low influence
of $f$ corresponds to uniformity of the vector representation of $f$.
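The definition of influence can be evaluated by brute force for small $m$. The sketch below assumes the expectation is taken over uniformly random inputs; the two example functions (`parity` and `majority`) are illustrative and are not taken from the text.

```python
from itertools import product

def influence(f, m, i):
    """I_i(f) = E_x |f(x) - f(x XOR e_i)|^2 over uniform x in {0,1}^m (i is 0-based)."""
    total = 0.0
    for x in product((0, 1), repeat=m):
        y = list(x)
        y[i] ^= 1
        total += abs(f(x) - f(tuple(y))) ** 2
    return total / 2 ** m

def max_influence(f, m):
    """I(f) = max_i I_i(f)."""
    return max(influence(f, m, i) for i in range(m))

# Two balanced example functions with values in {+1/2, -1/2} (illustrative choices).
def parity(x):
    return 0.5 if sum(x) % 2 else -0.5

def majority(x):
    return 0.5 if 2 * sum(x) > len(x) else -0.5

print(max_influence(parity, 3))    # every bit flip changes the parity
print(max_influence(majority, 3))  # a bit matters only when the other two disagree
```

For functions with values in $\{\pm\frac{1}{2}\}$ the squared difference is either 0 or 1, so $I_i(f)$ is simply the probability that flipping bit $i$ changes the output.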

Now we begin the construction of our QECC $\mathcal{M}$. Let $B$ be an integer such that $2B$ divides
$n$ and define $n' = \frac{n}{2B}$. Fix some balanced function $f : \{0,1\}^{n'} \to \{\pm\frac{1}{2}\}$ with low influence $s(n')$.
Balanced here means that $\sum_{x \in \{0,1\}^{n'}} f(x) = 0$. Thus we have $I(f) = s(n')$. We can view a
bitstring $x$ of length $n$ as a sequence $x_1 \cdots x_B$ of $B$ strings of length $2n'$. We further subdivide
each such string into two strings of length $n'$ so that we may write $x = x_{1,0}\,x_{1,1} \cdots x_{B,0}\,x_{B,1}$. Let
$z$ be some string in $\{0,1\}^B$. We now define $f_z : \{0,1\}^n \to \mathbb{C}$ by $f_z(x) = \prod_{k=1}^{B} f(x_{k,z_k})$ and define
our code $\mathcal{M} := \mathrm{Span}\{f_z \mid z \in \{0,1\}^B\}$. We are now ready to formally state the theorem.

Theorem 4.5 ([BT11, Theorem 4.1]). $\mathcal{M}$ is an $[n, B]$ QECC that is $(\mathcal{E}_{\mathrm{cbit}}, 2s(n'))$ immune.

To show that $\dim(\mathcal{M}) = 2^B$ it suffices to prove that $\{f_z \mid z \in \{0,1\}^B\}$ is an orthogonal set.
So let $z, z' \in \{0,1\}^B$ with $z \neq z'$. We need to show that $\langle f_z | f_{z'} \rangle = 0$. By our choice of $f$ we
know that $f(x_{i,z_i})$ is balanced over $\{\pm\frac{1}{2}\}$. The definition of $f_z$ as a product of $f$s then tells us
that $f_z$ is balanced over $\{\pm 2^{-B}\}$. Since $z \neq z'$ there is some $k \in [B]$ such that $z_k \neq z'_k$. Now
observe that $f(x_{k,z_k})$ and $f(x_{k,z'_k})$ depend on non-overlapping substrings of $x$ and hence their
values are independent and uniform over $\{\pm\frac{1}{2}\}$. Therefore the pair $(f_z(x), f_{z'}(x))$ is uniform over
$(\pm 2^{-B}, \pm 2^{-B})$ and so

\[ \langle f_z | f_{z'} \rangle = \sum_{x \in \{0,1\}^n} f_z(x)^* f_{z'}(x) = 0, \]

as desired.
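The construction and the orthogonality argument can be checked on a small instance. The sketch below assumes the block layout $x = x_{1,0}\,x_{1,1} \cdots x_{B,0}\,x_{B,1}$ described above and uses 3-bit majority as the balanced function $f$; that choice, and the helper name `code_basis`, are illustrative only.

```python
import numpy as np
from itertools import product

def majority(bits):
    """Balanced f: {0,1}^{n'} -> {+1/2, -1/2} for odd n' (an illustrative choice)."""
    return 0.5 if 2 * sum(bits) > len(bits) else -0.5

def code_basis(f, n_prime, B):
    """Return {z: f_z} with f_z(x) = prod_k f(x_{k, z_k}), as vectors of length 2^n."""
    n = 2 * B * n_prime
    inputs = list(product((0, 1), repeat=n))
    basis = {}
    for z in product((0, 1), repeat=B):
        vec = np.empty(len(inputs))
        for idx, x in enumerate(inputs):
            value = 1.0
            for k in range(B):
                start = (2 * k + z[k]) * n_prime   # offset of sub-block x_{k, z_k}
                value *= f(x[start:start + n_prime])
            vec[idx] = value
        basis[z] = vec
    return basis

basis = code_basis(majority, n_prime=3, B=2)       # n = 12, code dimension 2^B = 4
vectors = np.array(list(basis.values()))
gram = vectors @ vectors.T                         # matrix of inner products <f_z|f_z'>
```

For this instance the Gram matrix is diagonal, with every diagonal entry equal to $2^n \cdot 4^{-B} = 256$, so the four vectors $f_z$ are indeed pairwise orthogonal.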

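To prove the theorem we still need to show that for all $A \in \mathcal{E}_{\mathrm{cbit}}$ and all $\varphi \in \mathcal{M}$ we
have $|\langle\varphi| A |\varphi\rangle| \geq (1 - 2s(n'))\langle\varphi|\varphi\rangle$. Note that it suffices to prove that
$|\langle\varphi| A |\varphi\rangle - \langle\varphi|\varphi\rangle| \leq 2s(n')\,|\langle\varphi|\varphi\rangle|$, because

\[
\begin{aligned}
|\langle\varphi| A |\varphi\rangle - \langle\varphi|\varphi\rangle| \leq 2s(n')\,|\langle\varphi|\varphi\rangle| &\implies \\
|\langle\varphi|\varphi\rangle| - |\langle\varphi| A |\varphi\rangle| \leq 2s(n')\,|\langle\varphi|\varphi\rangle| &\iff \\
|\langle\varphi| A |\varphi\rangle| - |\langle\varphi|\varphi\rangle| \geq -2s(n')\,|\langle\varphi|\varphi\rangle| &\iff \\
|\langle\varphi| A |\varphi\rangle| \geq |\langle\varphi|\varphi\rangle| - 2s(n')\,|\langle\varphi|\varphi\rangle| &\iff \\
|\langle\varphi| A |\varphi\rangle| \geq (1 - 2s(n'))\,|\langle\varphi|\varphi\rangle|. &
\end{aligned}
\]

The first implication is a consequence of the reverse triangle inequality, i.e.,
$|\langle\varphi|\varphi\rangle| - |\langle\varphi| A |\varphi\rangle| \leq |\langle\varphi|\varphi\rangle - \langle\varphi| A |\varphi\rangle|$, which in turn equals $|\langle\varphi| A |\varphi\rangle - \langle\varphi|\varphi\rangle|$. We shall prove the first inequality and
hence the theorem using the following two lemmas.

Lemma 4.6 ([BT11, Lemma 4.3]). For every $\varphi \in \mathcal{M}$, $A \in \mathcal{E}_{\mathrm{cbit}}$ and $i \in [n]$,

\[ |\langle\varphi| A |\varphi\rangle - \langle\varphi|\varphi\rangle| \leq 2^{n-1} I_i(\varphi). \]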

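Proof. We begin by observing that to prove the lemma for all $A \in \mathcal{E}_{\mathrm{cbit}}$ we must show it for all
error operators $E_{i,S}$, where $i \in [n]$ and $S \subseteq \{0,1\}^{n-1}$. For the remainder of this proof we will treat
vectors $\varphi, \psi \in \mathcal{M}$ as functions $h, g : \{0,1\}^n \to \mathbb{C}$. Fixing some $i$ and $S$ we have

\[
\begin{aligned}
\langle h| E_{i,S} |g\rangle &= \sum_{x : x_{-i} \notin S} h(x)^* g(x) + \sum_{x : x_{-i} \in S} h(x)^* g(x \oplus e_i) \\
&= \sum_{x \in \{0,1\}^n} h(x)^* g(x) + \sum_{x : x_{-i} \in S} \bigl(h(x)^* g(x \oplus e_i) - h(x)^* g(x)\bigr).
\end{aligned}
\]

We now write $x \in \{0,1\}^n$ as a tuple $(x_{-i}, x_i)$. This gives us

\[
\begin{aligned}
|\langle h| E_{i,S} |g\rangle - \langle h|g\rangle|
&= \Bigl|\sum_{y \in S} \bigl(h(y,0)^* g(y,0) - h(y,0)^* g(y,1) - h(y,1)^* g(y,0) + h(y,1)^* g(y,1)\bigr)\Bigr| \\
&= \Bigl|\sum_{y \in S} \bigl(h(y,0)^* - h(y,1)^*\bigr)\bigl(g(y,0) - g(y,1)\bigr)\Bigr| \\
&\leq \sqrt{\sum_{y \in S} |h(y,0)^* - h(y,1)^*|^2}\;\sqrt{\sum_{y \in S} |g(y,0) - g(y,1)|^2} \\
&\leq \sqrt{\sum_{y \in \{0,1\}^{n-1}} |h(y,0)^* - h(y,1)^*|^2}\;\sqrt{\sum_{y \in \{0,1\}^{n-1}} |g(y,0) - g(y,1)|^2} \\
&= 2^{n-1} \sqrt{I_i(h)}\,\sqrt{I_i(g)},
\end{aligned}
\]

where the first inequality follows from the Cauchy-Schwarz inequality. Letting $g = h$ and observing
that we picked $i$ and $S$ arbitrarily, this proves the lemma. □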

Lemma 4.7 ([BT11, Lemma 4.4]). For every $\varphi \in \mathcal{M}$,

\[ 2^{n-1} I(\varphi) \leq 2s(n')\,|\langle\varphi|\varphi\rangle|. \]

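Proof. Fix some $i \in \{1, \ldots, n\}$ and suppose that $i$ is the $j$'th bit in $x_{k,b}$. We are given a $\varphi \in \mathcal{M}$
and write $\varphi = \sum_{z \in \{0,1\}^B} a_z f_z$. Note that if we can bound $I_i(\varphi)$ for our arbitrarily chosen $i$, then
we can bound $I(\varphi)$. We start with

\[
I_i(\varphi) = \mathbb{E}_{x \in \{0,1\}^n} |\varphi(x) - \varphi(x \oplus e_i)|^2
= \mathbb{E}_{x \in \{0,1\}^n} \Bigl|\sum_{z \in \{0,1\}^B} a_z \bigl(f_z(x) - f_z(x \oplus e_i)\bigr)\Bigr|^2
\]

and observe that the $f_z$ for which $f_z(x) = f_z(x \oplus e_i)$ do not contribute to the sum. Therefore

\[
\begin{aligned}
I_i(\varphi) &= \mathbb{E}_{x \in \{0,1\}^n} \Bigl|\sum_{z : z_k = b} a_z \bigl(f_z(x) - f_z(x \oplus e_i)\bigr)\Bigr|^2 \\
&= \mathbb{E}_{x \in \{0,1\}^n} \Bigl|\sum_{z : z_k = b} a_z \cdot \Bigl(\prod_{1 \leq l < k} f(x_{l,z_l})\Bigr) \bigl(f(x_{k,b}) - f(x_{k,b} \oplus e_j)\bigr) \Bigl(\prod_{k < l \leq B} f(x_{l,z_l})\Bigr)\Bigr|^2.
\end{aligned}
\]

Now observe that the factor $f(x_{k,b}) - f(x_{k,b} \oplus e_j)$ does not depend on $z$. We write $\tilde{x}$ for $x$ without
the $k$'th block ($x_{k,0}$ and $x_{k,1}$), so $\tilde{x} := x_1, \ldots, x_{k-1}, x_{k+1}, \ldots, x_B$. Then

\[
I_i(\varphi) = \mathbb{E}_{x_{k,b} \in \{0,1\}^{n'}} |f(x_{k,b}) - f(x_{k,b} \oplus e_j)|^2 \cdot
\mathbb{E}_{\tilde{x} \in \{0,1\}^{n-2n'}} \Bigl|\sum_{z : z_k = b} a_z \prod_{1 \leq l \leq B,\, l \neq k} f(x_{l,z_l})\Bigr|^2.
\]

To improve readability we define $\tilde{f}_z = \prod_{1 \leq l \leq B,\, l \neq k} f(x_{l,z_l})$ and $\tilde{\varphi} = \sum_{z : z_k = b} a_z \tilde{f}_z$. Note that $\tilde{f}_z$
does not depend on $x_{k,b}$ and neither does $\tilde{\varphi}$. Also note that by definition $s(n')$ is an upper bound
for the influence of $f$, so we write

\[ I_i(\varphi) \leq s(n') \cdot \mathbb{E}_{\tilde{x} \in \{0,1\}^{n-2n'}} |\tilde{\varphi}(\tilde{x})|^2. \]

We had previously shown that the $f_z$ are orthogonal, so $\langle\varphi|\varphi\rangle = \sum_{z \in \{0,1\}^B} |a_z|^2 \langle f_z|f_z\rangle$. By
an analogous argument the $\tilde{f}_z$ are also orthogonal, so $\langle\tilde{\varphi}|\tilde{\varphi}\rangle = \sum_{z : z_k = b} |a_z|^2 \langle\tilde{f}_z|\tilde{f}_z\rangle$. By the
uniformity of $f$ over $\{\pm\frac{1}{2}\}$ it follows that $\langle f_z|f_z\rangle = 2^n (\frac{1}{4})^B$ and $\langle\tilde{f}_z|\tilde{f}_z\rangle = 2^{n-2n'} (\frac{1}{4})^{B-1}$. Together
that tells us that $\langle f_z|f_z\rangle = 2^{2n'-2} \langle\tilde{f}_z|\tilde{f}_z\rangle$. We now compute

\[
\begin{aligned}
\mathbb{E}_{\tilde{x} \in \{0,1\}^{n-2n'}} |\tilde{\varphi}(\tilde{x})|^2
&= 2^{-(n-2n')} \langle\tilde{\varphi}|\tilde{\varphi}\rangle \\
&= 2^{-(n-2n')} \sum_{z : z_k = b} |a_z|^2 \langle\tilde{f}_z|\tilde{f}_z\rangle \\
&= 4 \cdot 2^{-n} \sum_{z : z_k = b} |a_z|^2 \langle f_z|f_z\rangle \\
&\leq 4 \cdot 2^{-n} \sum_{z \in \{0,1\}^B} |a_z|^2 \langle f_z|f_z\rangle \\
&= 2 \cdot 2^{-(n-1)}\,|\langle\varphi|\varphi\rangle|.
\end{aligned}
\]

Now it immediately follows that for any $i \in [n]$ we have $2^{n-1} I_i(\varphi) \leq 2s(n')\,|\langle\varphi|\varphi\rangle|$ and hence

\[ 2^{n-1} I(\varphi) \leq 2s(n')\,|\langle\varphi|\varphi\rangle|. \qquad □ \]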
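Theorem 4.5 can be sanity-checked numerically on a small code. The sketch below assumes 5-bit majority as the balanced function $f$ (influence $s = 6/16$, so the guaranteed ratio is $1 - 2s = 1/4$), takes $B = 1$, and samples random codewords and random controlled bit flips $E_{i,S}$ with $S \subseteq \{0,1\}^{n-1}$. All helper names are hypothetical, and the random sampling only probes the bound; it does not prove it.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)
n_prime, B = 5, 1
n = 2 * B * n_prime                       # n = 10 qubits, code dimension 2^B = 2

def majority(bits):
    return 0.5 if 2 * sum(bits) > len(bits) else -0.5

s = 6 / 16   # influence of 5-bit majority: the other four bits must split 2-2

inputs = list(product((0, 1), repeat=n))
index = {x: k for k, x in enumerate(inputs)}

def f_z(z):
    """Vector of f_z(x) = prod_k f(x_{k, z_k}) over all x in {0,1}^n."""
    out = np.empty(len(inputs))
    for idx, x in enumerate(inputs):
        val = 1.0
        for k in range(B):
            start = (2 * k + z[k]) * n_prime
            val *= majority(x[start:start + n_prime])
        out[idx] = val
    return out

basis = [f_z(z) for z in product((0, 1), repeat=B)]

def apply_error(phi, i, S):
    """Apply E_{i,S}: flip bit i of x whenever x without bit i lies in S."""
    out = np.zeros_like(phi)
    for x, k in index.items():
        rest = x[:i] + x[i + 1:]
        if rest in S:
            y = x[:i] + (1 - x[i],) + x[i + 1:]
            out[index[y]] += phi[k]
        else:
            out[k] += phi[k]
    return out

worst = 1.0
for _ in range(20):
    coeffs = rng.normal(size=len(basis)) + 1j * rng.normal(size=len(basis))
    phi = sum(a * v for a, v in zip(coeffs, basis))
    i = int(rng.integers(n))
    S = {y for y in product((0, 1), repeat=n - 1) if rng.random() < 0.5}
    ratio = abs(np.vdot(phi, apply_error(phi, i, S))) / np.vdot(phi, phi).real
    worst = min(worst, ratio)
```

Every sampled ratio $|\langle\varphi| A |\varphi\rangle| / \langle\varphi|\varphi\rangle$ should stay at or above $1 - 2s(n')$. For this particular instance the bound is attained by $\varphi = f_0$ with an unconditional flip inside block $x_{1,0}$, so it cannot be improved without changing $f$.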

In [BL89] Ben-Or and Linial define the "tribes" function, which splits its $n'$-bit input into
blocks of $\approx \log n' - c \log\log n'$ bits each. Each bit is considered a Boolean variable and tribes
first computes the conjunction of the variables in each block and then outputs the disjunction of
those conjunctions. The constant $c$ is chosen so that the function becomes approximately balanced.

The influence of tribes is $O\bigl(\frac{\log n'}{n'}\bigr)$, so when we use the tribes function to define $f(x) = 1/2$ if
$\mathrm{tribes}(x) = 1$ and $f(x) = -1/2$ if $\mathrm{tribes}(x) = 0$, then the theorem implies that for every $n$ and $B$
such that $2B$ divides $n$ there is an $[n, B]$ QECC that is $\bigl(\mathcal{E}_{\mathrm{cbit}}, O\bigl(\frac{B \log(n/B)}{n}\bigr)\bigr)$ immune. In particular
when we let $B = \sqrt{n}$, this shows that there is an $[n, \sqrt{n}]$ QECC that is $\bigl(\mathcal{E}_{\mathrm{cbit}}, O\bigl(\frac{\log(\sqrt{n})}{\sqrt{n}}\bigr)\bigr)$ immune.
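The tribes function and its influence can be illustrated by brute force. The sketch below uses hypothetical small parameters (12 input bits in 4 tribes of width 3) rather than the asymptotic block size from [BL89], and compares the enumerated influence of one bit with the closed form obtained by asking when that bit is pivotal.

```python
from itertools import product

def tribes(x, width):
    """OR over consecutive blocks of `width` bits of the AND within each block."""
    blocks = [x[k:k + width] for k in range(0, len(x), width)]
    return int(any(all(block) for block in blocks))

def influence_of_bit(m, width, i):
    """Fraction of inputs x in {0,1}^m on which flipping bit i changes tribes(x)."""
    changed = 0
    for x in product((0, 1), repeat=m):
        y = x[:i] + (1 - x[i],) + x[i + 1:]
        changed += tribes(x, width) != tribes(y, width)
    return changed / 2 ** m

m, width = 12, 3                       # 4 tribes of 3 bits each (illustrative sizes)
brute_force = influence_of_bit(m, width, 0)
# Bit i is pivotal iff its own tribe is otherwise all ones and every other tribe fails.
closed_form = 2.0 ** -(width - 1) * (1 - 2.0 ** -width) ** (m // width - 1)
prob_one = 1 - (1 - 2.0 ** -width) ** (m // width)   # Pr[tribes(x) = 1]
```

With $f(x) = \pm\frac{1}{2}$ defined from tribes as above, $|f(x) - f(x \oplus e_i)|^2$ is 1 exactly when the flip changes the output, so this fraction equals $I_i(f)$. For these small parameters the function is only roughly balanced, which is why the block width has to be tuned through the constant $c$.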



4.3 Controlled phase errors cannot be approximately corrected 

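We return to the negative results by considering controlled phase flips. For $S \subseteq \{0,1\}^n$ and
$\theta \in [0, 2\pi)$ we can define the error operator $E_{S,\theta}$ by $E_{S,\theta}|x\rangle = e^{i\theta}|x\rangle$ if $x \in S$ and $E_{S,\theta}|x\rangle = |x\rangle$
otherwise. This lets us define the set of error operators that are controlled phase errors as

\[ \mathcal{E}_{\mathrm{cphase}} := \{E_{S,\theta} \mid S \subseteq \{0,1\}^n \text{ and } \theta \in [0, 2\pi)\}. \]

Note that for every partition $\bar{S} = (S_1, S_2, S_3, S_4)$ of $\{0,1\}^n$ the set $\mathcal{E}_{\mathrm{cphase}}$ contains the operators
$E_{S_2 \cup S_4, -\pi/2}$ and $E_{S_3 \cup S_4, \pi}$. In the following we will let $E_{\bar{S}}$ stand for $E_{S_2 \cup S_4, -\pi/2}^{\dagger} E_{S_3 \cup S_4, \pi}$. We
now observe that for all $x \in \{0,1\}^n$ we have

\[
E_{\bar{S}}|x\rangle =
\begin{cases}
|x\rangle & \text{if } x \in S_1 \\
e^{\frac{\pi i}{2}}|x\rangle & \text{if } x \in S_2 \\
e^{\pi i}|x\rangle & \text{if } x \in S_3 \\
e^{\frac{3\pi i}{2}}|x\rangle & \text{if } x \in S_4.
\end{cases}
\]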
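The four phases of $E_{\bar{S}}$ can be verified on a two-qubit example. The sketch below assumes that $E_{\bar{S}}$ is the adjoint of the rotation over $-\pi/2$ on $S_2 \cup S_4$ composed with the rotation over $\pi$ on $S_3 \cup S_4$; basis states are encoded as integers and the function name `phase_error` is hypothetical.

```python
import numpy as np

def phase_error(n, S, theta):
    """Diagonal of E_{S,theta}: multiply |x> by e^{i theta} if x in S, else leave it."""
    diag = np.ones(2 ** n, dtype=complex)
    for x in S:
        diag[x] = np.exp(1j * theta)
    return diag

n = 2
# A partition (S1, S2, S3, S4) of {0,1}^2, with basis states written as integers 0..3.
S1, S2, S3, S4 = {0}, {1}, {2}, {3}

A = phase_error(n, S2 | S4, -np.pi / 2)
B = phase_error(n, S3 | S4, np.pi)
E_bar = np.conj(A) * B                 # diagonal of A^dagger B

expected = np.exp(1j * np.array([0.0, np.pi / 2, np.pi, 3 * np.pi / 2]))
```

Since both operators are diagonal, their product is computed entrywise. Under the stated assumption the result applies the phases $1$, $i$, $-1$ and $-i$ to $S_1$ through $S_4$ respectively.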
We will show that there is no non-trivial QECC that can correct $\mathcal{E}_{\mathrm{cphase}}$. Non-trivial here means
that the QECC must have more than one codeword. In fact we will prove something stronger,
namely that no non-trivial QECC can separate $\mathcal{E}_{\mathrm{cphase}}$ with reasonable error. We define this as
follows.

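Definition 4.8. A QECC $\mathcal{M}$ separates $\mathcal{E} \subseteq L(\mathcal{N}, \mathcal{N}')$ with at most $\alpha$ error if for any two operators
$A, B \in \mathcal{E}$ and any two codewords $\varphi, \psi \in \mathcal{M}$,

\[ \langle\varphi|\psi\rangle = 0 \implies |\langle\varphi| A^{\dagger} B |\psi\rangle| \leq \alpha. \]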

The motivation for this definition is that if a QECC cannot separate a set of errors, then errors 
from that set can turn orthogonal states into non-orthogonal states. That in turn means that they 
can no longer be perfectly distinguished by any quantum measurement, and so in particular the 
error cannot be perfectly corrected. To prove the theorem we first need a small lemma. 

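Lemma 4.9 ([BT11, Lemma 5.2]). Let $\mathcal{M}$ be a vector space with $\dim(\mathcal{M}) > 1$. Then there are
two orthonormal vectors $\varphi, \psi \in \mathcal{M}$ such that $\sum_x |\varphi(x)| \cdot |\psi(x)| \geq 1/2$.

Proof. Let $\varphi, \psi \in \mathcal{M}$ be two orthonormal vectors and let $\varphi' = \frac{1}{\sqrt{2}}(\varphi + \psi)$ and $\psi' = \frac{1}{\sqrt{2}}(\varphi - \psi)$.
Then

\[ \sum_x |\varphi'(x)| \cdot |\psi'(x)| = \frac{1}{2} \sum_x |\varphi(x) + \psi(x)| \cdot |\varphi(x) - \psi(x)|. \]

Fixing some $x \in \{0,1\}^n$ and assuming without loss of generality that $|\varphi(x)| \geq |\psi(x)|$ we find

\[
\begin{aligned}
|\varphi(x) + \psi(x)| \cdot |\varphi(x) - \psi(x)| &= |(\varphi(x) + \psi(x))(\varphi(x) - \psi(x))| \\
&= |\varphi(x)^2 - \psi(x)^2| \\
&\geq |\varphi(x)|^2 - |\psi(x)|^2 \\
&\geq |\varphi(x)|^2 + |\psi(x)|^2 - 2\,|\varphi(x)| \cdot |\psi(x)|,
\end{aligned}
\]

where the last inequality is because $|\varphi(x)| \geq |\psi(x)|$. Now we can write

\[
\sum_x |\varphi'(x)| \cdot |\psi'(x)| \geq \frac{1}{2} \sum_x \bigl(|\varphi(x)|^2 + |\psi(x)|^2\bigr) - \sum_x |\varphi(x)| \cdot |\psi(x)| = 1 - \sum_x |\varphi(x)| \cdot |\psi(x)|,
\]

so either $\sum_x |\varphi(x)| \cdot |\psi(x)|$ or $\sum_x |\varphi'(x)| \cdot |\psi'(x)|$ is at least $1/2$. Therefore either $\varphi, \psi$ or $\varphi', \psi'$ are
our witnesses. □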
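The witness construction in this proof is easy to test numerically. The following sketch, with hypothetical helper names, draws random orthonormal pairs and applies the rotation $\varphi' = (\varphi + \psi)/\sqrt{2}$, $\psi' = (\varphi - \psi)/\sqrt{2}$ whenever the original pair falls short of $1/2$.

```python
import numpy as np

def overlap_of_magnitudes(phi, psi):
    """sum_x |phi(x)| * |psi(x)|."""
    return float(np.sum(np.abs(phi) * np.abs(psi)))

def witnesses(phi, psi):
    """Return an orthonormal pair from span{phi, psi} with magnitude overlap >= 1/2."""
    if overlap_of_magnitudes(phi, psi) >= 0.5:
        return phi, psi
    return (phi + psi) / np.sqrt(2), (phi - psi) / np.sqrt(2)

rng = np.random.default_rng(1)
results = []
for _ in range(100):
    # Random orthonormal pair in C^8 from a QR decomposition.
    m = rng.normal(size=(8, 2)) + 1j * rng.normal(size=(8, 2))
    q, r = np.linalg.qr(m)
    a, b = witnesses(q[:, 0], q[:, 1])
    results.append((overlap_of_magnitudes(a, b), abs(np.vdot(a, b))))

# A pair with disjoint supports has overlap 0, so the rotation is really needed.
e0 = np.array([1.0, 0.0], dtype=complex)
e1 = np.array([0.0, 1.0], dtype=complex)
rotated = witnesses(e0, e1)
```

The disjoint-support pair shows why the lemma needs two candidate pairs: $|0\rangle, |1\rangle$ has overlap 0, whereas its rotation $|{+}\rangle, |{-}\rangle$ has overlap 1.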

Now we are ready to prove the theorem. 

Theorem 4.10 ([BT11, Theorem 5.1]). There is no QECC with dimension 2 that can separate
$\mathcal{E}_{\mathrm{cphase}}$ with error $\alpha \leq \frac{1}{10}$.



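Proof. We need to show that for some $A, B \in \mathcal{E}_{\mathrm{cphase}}$ and some unit vectors $\varphi, \psi \in \mathcal{M}$ we have both
$\langle\varphi|\psi\rangle = 0$ and $|\langle\varphi| A^{\dagger} B |\psi\rangle| > \frac{1}{10}$. So it suffices to show that for some $\bar{S}$ we have $|\langle\varphi| E_{\bar{S}} |\psi\rangle| > \frac{1}{10}$
for some orthonormal $\varphi, \psi$.

Let $\varphi, \psi$ be as in Lemma 4.9. We may express $\varphi(x) = r_x e^{i\theta_x}$ and $\psi(x) = r'_x e^{i\theta'_x}$, where $r_x =
|\varphi(x)|$ and $r'_x = |\psi(x)|$. Letting $\theta_1 = 0$, $\theta_2 = \frac{\pi}{2}$, $\theta_3 = \pi$ and $\theta_4 = \frac{3\pi}{2}$ we define $\bar{S} = (S_1, S_2, S_3, S_4)$
by putting $x \in S_{j(x)}$ where $j(x) = \arg\min_{j \in [4]} \{|{-\theta_x} + \theta'_x + \theta_j \bmod 2\pi|\}$. Observe that

\[ \min_{j \in [4]} \{|{-\theta_x} + \theta'_x + \theta_j \bmod 2\pi|\} \leq \frac{\pi}{4} \]

because $\theta'_x - \theta_x$ modulo $2\pi$ is in $[-\pi, \pi)$ and the four angles $\theta_j$ are spaced $\pi/2$ apart.

We now let $\zeta_x = -\theta_x + \theta'_x + \theta_j$ (note that $\theta_j$ depends on $x$) and $u_x = 1 - e^{i\zeta_x}$. This lets us
write

\[
\langle\varphi| E_{\bar{S}} |\psi\rangle = \sum_{x \in \{0,1\}^n} r_x r'_x e^{i\zeta_x} = \sum_{x \in \{0,1\}^n} r_x r'_x (1 - u_x).
\]

We now compute

\[
\begin{aligned}
|u_x|^2 &= (1 - \cos(\zeta_x))^2 + \sin^2(\zeta_x) \\
&= 1 - 2\cos(\zeta_x) + \cos^2(\zeta_x) + \sin^2(\zeta_x) \\
&= 2(1 - \cos(\zeta_x)).
\end{aligned}
\]

Note that $|\zeta_x| \in [0, \pi/4]$ when $\zeta_x$ is taken modulo $2\pi$ in $[-\pi, \pi)$. Since the cosine is decreasing on $[0, \pi/4]$, $\cos(\zeta_x)$ is minimal at $\pi/4$,
where its value is $1/\sqrt{2}$. Hence $2(1 - \cos(\zeta_x)) \leq 2 - \sqrt{2}$ and $|u_x| \leq \sqrt{2 - \sqrt{2}}$. This lets us further
derive

\[
\begin{aligned}
|\langle\varphi| E_{\bar{S}} |\psi\rangle| &= \Bigl|\sum_{x \in \{0,1\}^n} r_x r'_x (1 - u_x)\Bigr| \\
&\geq \sum_{x \in \{0,1\}^n} r_x r'_x - \Bigl|\sum_{x \in \{0,1\}^n} r_x r'_x u_x\Bigr| \\
&\geq \Bigl(1 - \max_x |u_x|\Bigr) \sum_{x \in \{0,1\}^n} r_x r'_x \\
&\geq \Bigl(1 - \sqrt{2 - \sqrt{2}}\Bigr) \sum_{x \in \{0,1\}^n} r_x r'_x.
\end{aligned}
\]

By our choice of $\varphi$ and $\psi$ we have that $\sum_{x \in \{0,1\}^n} r_x r'_x \geq 1/2$, therefore

\[ |\langle\varphi| E_{\bar{S}} |\psi\rangle| \geq \Bigl(1 - \sqrt{2 - \sqrt{2}}\Bigr) \cdot \frac{1}{2} > \frac{1}{10}, \]

as desired. □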
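The partition used in this proof can be built explicitly for a concrete pair of codewords. The sketch below assumes the phases $0, \pi/2, \pi, 3\pi/2$ on $S_1, \ldots, S_4$ and a random orthonormal pair in $\mathbb{C}^{16}$, rotated as in Lemma 4.9 when necessary; the helper names are hypothetical.

```python
import numpy as np

THETAS = np.array([0.0, np.pi / 2, np.pi, 3 * np.pi / 2])   # phases on S1..S4

def wrap(angle):
    """Representative of angle modulo 2*pi in [-pi, pi)."""
    return (angle + np.pi) % (2 * np.pi) - np.pi

def best_partition(phi, psi):
    """j(x) = argmin_j |(-theta_x + theta'_x + theta_j) mod 2 pi| for every x."""
    diff = np.angle(psi) - np.angle(phi)
    return np.argmin(np.abs(wrap(diff[:, None] + THETAS[None, :])), axis=1)

def separation_failure(phi, psi):
    """|<phi| E_Sbar |psi>| for the partition chosen above."""
    phases = np.exp(1j * THETAS[best_partition(phi, psi)])
    return abs(np.sum(np.conj(phi) * phases * psi))

rng = np.random.default_rng(2)
m = rng.normal(size=(16, 2)) + 1j * rng.normal(size=(16, 2))
q, _r = np.linalg.qr(m)
phi, psi = q[:, 0], q[:, 1]
if np.sum(np.abs(phi) * np.abs(psi)) < 0.5:         # rotate as in Lemma 4.9
    phi, psi = (phi + psi) / np.sqrt(2), (phi - psi) / np.sqrt(2)

value = separation_failure(phi, psi)
lower_bound = (1 - np.sqrt(2 - np.sqrt(2))) / 2      # about 0.117 > 1/10
```

Although `phi` and `psi` are orthogonal, the error $E_{\bar{S}}$ aligns their phases to within $\pi/4$ on every basis state, so their overlap after the error is bounded away from zero.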



Observe that the proof shows that we cannot even separate the restriction of $\mathcal{E}_{\mathrm{cphase}}$ to rotations
over $\pi$ and $-\pi/2$, let alone the arbitrary rotations the unrestricted $\mathcal{E}_{\mathrm{cphase}}$ allows.



5 More speculative objections 

In this section we present more objections to FTQC, namely a selection of objections put forth 
by Kalai in [Kal08, Kal09, Kal11]. Although these objections are less precise than those presented
in the previous section, Kalai tries hard to identify possible flaws in the theory of fault-tolerant 
quantum computing as it exists today. Regardless of how well-founded these objections will turn 
out to be, the work of Kalai is very valuable to help us better understand the nature of noise that 
impacts quantum systems and how such noise can be guarded against. 



5.1 Noise propagation 

Most of Kalai's objections deal with the assumptions made about the noise models for which 
threshold theorems have been proved. His first such objection deals with noise propagation, the 
way in which errors occurring at a particular time during the computation spread across the circuit 
as the computation proceeds. In Sections 2 and 3 we assumed that 1-Recs could be constructed 
that met conditions 1 through 5 on page 6, which limit noise propagation. In particular we assumed 
that we could limit the accumulation of noise, by removing all the noise every time we ran our 
error-detection and error-correction procedures. Kalai suggests that modeling noise propagation is 
fundamental to modeling noisy quantum systems and that we should identify the mathematical 
properties of noise propagation. In [Kal09, Section 6.2] and [Kal11, Section 6] Kalai proposes such
a property and conjectures that fault-tolerant quantum computing is impossible for noise models
having that property.

5.2 Preparing codewords



Another objection has to do with our ability to encode quantum states using a QECC, i.e., to 
prepare codewords. In our discussion of fault-tolerant quantum computing we have assumed that 
a qubit preparation rectangle has at most one error in its output. In particular, this implies that 
the state generated by the rectangle does not contain a superposition of codewords. Rather, it may 
be a superposition of a codeword with one or more states that are not codewords. That allowed 
us to perform the error-detection and error-correction steps as outlined in Section 1. Naturally, 
when the output of such a rectangle is a superposition of codewords, we can neither detect the 
error nor correct it. Kalai proposes as Conjecture 1 in [Kal11] that the act of preparing encoded
qubits inherently results in a superposition of the intended codeword and undesirable codewords. 

5.3 Error synchronization 

Kalai's main objection, however, is based on a physical conjecture. For the threshold results
presented in this paper it was assumed that the spatial and temporal correlations between errors are
either non-existent (Section 2) or highly localized (Section 3). When we consider the probability
distribution of the number $k$ of errors hitting an $n$-qubit state, a consequence of that assumption
is that beyond the expected value for the number of errors, the probability decreases
exponentially with $k$. In other words, the probability distribution of the number of errors has a small
tail. Kalai observes that the QECCs used to prove these threshold results generate and operate
on highly entangled states. Formal definitions can be given for measures of the entanglement of
(mixed) quantum states, but we forgo giving them here. He makes the physical conjecture that
errors hitting such entangled states will be highly correlated, a phenomenon he calls error
synchronization. In [Kal09, Section 7.2] Kalai makes this conjecture more precise. The impact of
error synchronization on FTQC can perhaps best be understood by taking a small detour back to
the classical world and considering the effect of error correlation on binary strings. We will prove a
lemma demonstrating that this effect is that we can no longer assume that our distributions have
small tails. This is a generalization of Lemma 1 in [Kal08], which is also referenced as Proposition
6 in [Kal09]. As such, if Kalai's conjecture about error synchronization turns out to be true, this
could have serious repercussions for fault-tolerant quantum computing.

In the following we will let $[n]$ stand for the set $\{1, \ldots, n\}$. A binary string of length $n$ can
be seen as an indicator string for errors, where a 1 indicates that an error has occurred on that
position and a 0 indicates that no error has occurred. Given a probability distribution $\mathcal{D}$ on binary
strings $x = x_1 \cdots x_n$ of length $n$ and $i, j \in [n]$, we define the pairwise correlation $c_{ij}(\mathcal{D})$ to be
$\Pr_{x \sim \mathcal{D}}(x_j = 1 \mid x_i = 1)$. This definition is the author's interpretation of $c_{ij}$ as it is used, but
not defined, in [Kal08]. Note that $c_{ii}(\mathcal{D}) = 1$ for all $\mathcal{D}$ and $i \in [n]$, but this does not matter as
our lemma will only assume a lower bound on $c_{ij}(\mathcal{D})$. For a binary string $x$ we let $|x|$ denote the
Hamming weight of $x$.

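Lemma 5.1. Suppose that $\mathcal{D}$ is a probability distribution on binary strings of length $n$ and let $s$
be such that for all $i, j \in [n]$, $c_{ij}(\mathcal{D}) \geq s$. For binary strings $y$ and $z$, $y \preceq z$ means that $y$ is an
initial segment of $z$. Then

\[ \Pr_{x \sim \mathcal{D}}(|x| > sn/2) \geq \sum_{i=0}^{n-1} \Pr_{x \sim \mathcal{D}}(0^i 1 \preceq x) \cdot \frac{s/2 - si/n}{1 - s/2}. \]
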
Proof. All probabilities are assumed to be according to $\mathcal{D}$, i.e., $x \sim \mathcal{D}$. We start by observing that
for $i = 0, \ldots, n-1$

\[ \mathbb{E}(|x| \mid 0^i 1 \preceq x) \geq 1 + (n - i - 1)s \geq (n - i)s, \]

because the $(i+1)$'st position is 1 and each of the $(n-i-1)$ positions after that is 1 with probability
$\geq s$, because the $(i+1)$'st position is 1. Furthermore we can upper bound this expected value as

\[ \mathbb{E}(|x| \mid 0^i 1 \preceq x) \leq \Pr(|x| \leq sn/2 \mid 0^i 1 \preceq x)\, sn/2 + \Pr(|x| > sn/2 \mid 0^i 1 \preceq x)\, n, \]

for $i = 0, \ldots, n-1$. Combining these equations and using the fact that $\Pr(|x| \leq sn/2 \mid 0^i 1 \preceq x) =
1 - \Pr(|x| > sn/2 \mid 0^i 1 \preceq x)$ we compute that

\[ \Pr(|x| > sn/2 \mid 0^i 1 \preceq x) \geq \frac{s/2 - si/n}{1 - s/2}. \]

Now we can use this equation together with the observation that

\[ \Pr(|x| > sn/2) = \sum_{i=0}^{n-1} \Pr(0^i 1 \preceq x) \Pr(|x| > sn/2 \mid 0^i 1 \preceq x) \]

to obtain the result. □
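The bound obtained by combining the last two displays of the proof can be evaluated exactly for small distributions. The sketch below is illustrative only: it assumes $s$ is taken to be the minimum pairwise correlation, that every position is in error with positive probability, and it checks just two hand-picked distributions (fully synchronized errors and independent errors) rather than arbitrary ones. The function name `check_lemma` is hypothetical.

```python
from itertools import product

def check_lemma(dist, n):
    """dist maps bit tuples of length n to probabilities. Returns (s, lhs, rhs).
    Assumes every position is in error with positive probability."""
    def pr(event):
        return sum(p for x, p in dist.items() if event(x))

    # s = min over i, j of c_ij = Pr(x_j = 1 | x_i = 1).
    s = min(
        pr(lambda x: x[i] == 1 and x[j] == 1) / pr(lambda x: x[i] == 1)
        for i in range(n) for j in range(n)
    )
    lhs = pr(lambda x: sum(x) > s * n / 2)
    rhs = sum(
        pr(lambda x: x[:i + 1] == (0,) * i + (1,)) * (s / 2 - s * i / n) / (1 - s / 2)
        for i in range(n)
    )
    return s, lhs, rhs

n = 6
# Fully synchronized errors: all-zero with probability p, all-one otherwise.
p = 0.3
synced = {(0,) * n: p, (1,) * n: 1 - p}
# Independent errors with rate q on every position.
q = 0.2
independent = {
    x: q ** sum(x) * (1 - q) ** (n - sum(x)) for x in product((0, 1), repeat=n)
}

s_sync, lhs_sync, rhs_sync = check_lemma(synced, n)
s_ind, lhs_ind, rhs_ind = check_lemma(independent, n)
```

For the synchronized distribution both sides equal the probability of the all-ones string, so the bound is met with equality; for independent errors the bound holds with considerable slack.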

This lemma shows that when errors are correlated, the distribution of the number of errors
hitting a state has a fat tail, i.e., beyond the expected number of errors the probability decreases
only polynomially with the number of errors, not exponentially as before. Writing out the first two
terms of the bound we find that

\[ \Pr(|x| > sn/2) \geq \Pr(1 \preceq x) \cdot \frac{s/2}{1 - s/2} + \Pr(01 \preceq x) \cdot \frac{s/2 - s/n}{1 - s/2} + \cdots, \]

illustrating that the decrease is indeed polynomial. This violates the assumption we made in
Section 2 that qubits are hit by errors independently. In Section 3 we made no such assumption,
but this result suggests that the noise strength $\eta$ might be too large, i.e., above the threshold.
Therefore if Kalai's conjecture that highly entangled states lead to error synchronization is correct,
then the current threshold theorems do not apply for realistic noise models and the possibility of
fault-tolerant quantum computing again becomes an open question.

We observe that the bound shown in the lemma is close to optimal, for let $\mathcal{D}$ be a distribution
where $x = 0^n$ with some probability $p \in (0, 1)$ and $x = 1^n$ with probability $1 - p$. Then for all
$i, j \in [n]$ we have $c_{ij}(\mathcal{D}) = 1$ and

\[ \Pr_{x \sim \mathcal{D}}(|x| > n/2) = \Pr(1 \preceq x) \cdot \frac{1/2}{1 - 1/2} = \Pr(1 \preceq x) = 1 - p. \]

6 Conclusions and outlook 

The goal of this survey was to give an overview of the current state of FTQC, to list important 
positive and negative results and to show that a large, gray area remains largely unexplored in 
between. In Sections 2 and 3 we presented some important positive results, namely that thresholds 
for fault-tolerant quantum computing can be established for a number of noise models. Although 
the exact numerical value of these thresholds is of great practical importance, the differences 
between proved minimal and maximal values are still several orders of magnitude. Closing in on 
exact numerical values for thresholds under various noise models and using various QECCs remains 
an important research goal in the field of fault-tolerant quantum computing. 

The noise models for which we presented threshold results allow for only very weak spatial and 
temporal correlations between errors. One direction forward would thus be to prove that threshold 
results can be obtained for more strongly correlated noise models. In fact, any relaxation of the 
assumptions we made in Sections 2 and 3 would be of great value. 

In Section 4 we have seen that when we allow noise models to use the very entanglement that 
gives quantum computation its edge over classical computation against us, we should let go of 
the idea of perfectly correcting errors and instead focus on approximately correcting them. For 
controlled phase errors, however, even that will not be possible. Without allowing the noise to act 
conditionally in the sense of Section 4, we can still endeavor to obtain threshold results for error 
distributions that are not entirely independent. It may be possible to prove that fault-tolerant 
quantum computing is possible for correlated errors, when we also consider a threshold for the 
correlation of errors. 

As for obtaining more negative results, the objections put forth by Kalai deserve further study 
and formalization. In the end, experiments with actual noisy quantum systems will likely determine 
what will be considered physically realistic noise models.